  1. Aug 2024
    1. Reviewer #1 (Public Review):

      Summary:

      The study by Wu et al. presents interesting data on bacterial cell organization, a field that is advancing rapidly, mainly owing to advances in microscopy. Based mainly on fluorescence microscopy images, the authors aim to demonstrate that the two structures that account for bacterial motility, the chemotaxis complex and the flagella, colocalize to the same pole in Pseudomonas aeruginosa cells, and to expose the regulation underlying their spatial organization and functioning.

      Strengths:

      The subject is of importance.

      Weaknesses:

      The conclusions are too strong for the presented data. The lack of statistical analysis makes this paper incomplete. The novelty of the findings is not clear.

      Major issues:

      (1) The novelty is in question since in the Abstract the authors highlight their main finding, which is that both the chemotaxis complex and the flagella localize to the same pole, as surprising. However, in the Introduction they state that "pathway-related receptors that mediate chemotaxis, as well as the flagellum are localized at the same cell pole17,18". I am not a Pseudomonas researcher and from my short glance at these references, I could not tell whether they report colocalization of the two structures to the same pole. However, I trust that the authors know the literature on the localization of the chemotaxis complex and flagella in their organism. See also major issue number 5 on the novelty regarding the involvement of c-di-GMP.

      (2) Statistics for the microscopy images, on which most conclusions in this manuscript are based, are completely missing. Given that most micrographs present one or very few cells, together with the fact that almost all conclusions depend on whether certain macromolecules are at one or two poles and whether different complexes are in the same pole, proper statistics, based on hundreds of cells in several fields, are absolutely required. Without this information, the results are anecdotal and do not support the conclusions. Due to the importance of statistics for this manuscript, strict statistical tests should be used and reported. Moreover, representative large fields with many cells should be added as supportive information.

      The problem is more pronounced when the authors make strong statements, as in lines 157-158: "The results revealed that the chemoreceptor arrays no longer grow robustly at the cell pole (Figure 2A)". Looking at the seven cells shown in Figure 2A, five of them show polar localization of the chemoreceptors. The question is then: what is the percentage of cells that show precise polar, near-polar, or mid cell localization (the three patterns shown here) in the mutant and in the wild type? Since I know that these three patterns can also be observed in WT cells, what counts is the difference, and whether it is statistically significant.

      Even for the graphs in Figures 3C and 3D, which show the proportion of cells with obvious chemoreceptor arrays and the absolute fluorescence brightness of the chemosensory array, respectively, two questions arise: for how many individual cells do these values hold, and how significant is the difference between each pair of strains?
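
      The comparison requested here can be sketched as a test of independence between strain and localization class. The following is a minimal illustration in Python, not the authors' analysis: the function name `chi_square_2x3` and all cell counts are hypothetical, every localization class is assumed to contain at least one cell, and a Pearson chi-square test is only one reasonable choice (Fisher's exact test would suit small counts better).

```python
import math


def chi_square_2x3(wt_counts, mutant_counts):
    """Pearson chi-square test of independence for a 2 x 3 table.

    Rows are strains; columns are localization classes
    (polar, near-polar, mid-cell). The table has 2 degrees of freedom,
    for which the chi-square survival function is exactly exp(-x / 2).
    """
    table = [list(wt_counts), list(mutant_counts)]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if localization were independent of strain.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value


# Hypothetical counts (polar, near-polar, mid-cell); NOT data from the manuscript.
wt = (270, 20, 10)
mutant = (150, 90, 60)

stat, p = chi_square_2x3(wt, mutant)
print(f"chi-square = {stat:.1f}, p = {p:.3g}")
```

      With a few hundred cells scored per strain, as in these made-up counts, the output states directly whether the distribution of localization patterns differs between strains, which is the information missing from the single-cell micrographs.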

      (3) The authors conclude that "Motor structural integrity is a prerequisite for chemoreceptor self-assembly" based on the reduction in cells with chemoreceptor clusters in mutants deleted for flagellar genes, despite the proper polar localization of the chemotaxis protein CheY. They show that the level of CheY in the WT and the mutant strains is similar, based on a Western blot, which in my opinion is over-exposed. "To ascertain whether it is motor integrity rather than functionality that influences the efficiency of chemosensory array assembly", they constructed a mutant deleted for the flagella stator and found that the motor is stalled while CheY behaves as in WT cells. The authors further "quantified the proportion of cells with receptor clusters and the absolute fluorescence intensity of individual clusters (Figures 3C-D)". While Figure 3C suggests that, indeed, the flagella mutants show fewer cells with a chemotaxis complex, Figure 3D suggests that the differences in fluorescence intensity are not statistically significant.

      Since it is obvious that the regulation of both structures' production and localization is codependent, I think that it takes more than a Western blot to make such a decision.

      (4) I wonder why the authors chose to label CheY, which is the only component of the chemotaxis complex that shuttles back and forth to the base of the flagella. In any case, I think that they should strengthen their results by repeating some key experiments with labeled CheW or CheA.

      (5) The last section of the results is very problematic, regarding the rationale, the conclusions, and the novelty. As far as the rationale is concerned, I do not understand why the authors assume that "a spatial separation between the chemoreceptors and flagellar motors should not significantly impact the temporal comparison in bacterial chemotaxis". Is there any proof for that? More surprising for me was to read that "The signal transduction pathways in E. coli are relatively simple, and the chemotaxis response regulator CheY-P affects only the regulation of motor switching". There are degrees of complexity among signal transduction pathways in E. coli, but the chemotaxis seems to be ranked at the top. CheY is part of the adaptation. Perfect adaptation, as many other issues related to the chemotaxis pathway, which include the wide dynamic range, the robustness, the sensitivity, and the signal amplification (gain), are still largely unexplained. Hence, such assumptions are not justified.

      More perplexing is the novelty of the authors' documentation of the effect of the chemotaxis proteins on the c-di-GMP level. In 2013, Kulasekara et al. published a paper in eLife entitled "c-di-GMP heterogeneity is generated by the chemotaxis machinery to regulate flagellar motility". In the same year, Kulasekara published a paper entitled "Insight into a Mechanism Generating Cyclic di-GMP Heterogeneity in Pseudomonas aeruginosa". The authors did not cite these works and I wonder why.

      (6) Throughout the manuscript, the authors refer to foci of fluorescent CheY as "chemoreceptor arrays". If anything, these foci signify the chemotaxis complex, not the membrane-traversing chemoreceptors.

      Conclusions:

      The manuscript addresses an interesting subject and contains interesting, but incomplete, data.

    2. Reviewer #2 (Public Review):

      Summary:

      Here, the authors studied the molecular mechanisms by which the chemoreceptor cluster and flagella motor of Pseudomonas aeruginosa (PA) are spatially organized in the cell. They argue that FlhF is involved in localizing the receptors-motor to the cell pole, and even without FlhF, the two are colocalized. FlhF is known to cause the motor to localize to the pole in a different bacterial species, Vibrio cholerae, but it is not involved in receptor localization in that bacterium. Finally, the authors argue that the functional reason for this colocalization is to insulate chemotactic signaling from other signaling pathways, such as cyclic-di-GMP signaling.

      Strengths:

      The experiments and data look to be high-quality.

      Weaknesses:

      However, the interpretations and conclusions drawn from the experimental observations are not fully justified in my opinion.

      I see two main issues with the evidence provided for the authors' claims.

      (1) Assumptions about receptor localization:

      The authors rely on YFP-tagged CheY to identify the location of the receptor cluster, but CheY is a diffusible cytoplasmic protein. In E. coli, CheY has been shown to localize at the receptor cluster, but the evidence for this in PA is less strong. The authors refer to a paper by Guvener et al. 2006, which showed that CheY localizes to a cell pole, and CheA (a receptor cluster protein) also localizes to a pole, but my understanding is that colocalization of CheY and CheA was not shown. My concern is that CheY could instead localize to the motor in PA, say by binding FliM. This "null model" would explain the authors' observations, without colocalization of the receptors and motor.

      Verifying that CheY and CheA are colocalized in PA would be a very helpful experiment to address this weakness.

      (2) Argument for the functional importance of receptor-motor colocalization at the pole:

      The authors argue that colocalization of the receptors and motors at the pole is important because it could keep phosphorylated CheY, CheY-p, restricted to a small region of the cell, preventing crosstalk with other signaling pathways. Their evidence for this is that overexpressing CheY leads to higher intracellular cdG levels and cell aggregation.

      Say that the receptors and motors are colocalized at the pole. In E. coli, CheY-p rapidly diffuses through the cell. What would prevent this from occurring in PA, even with colocalization?

      Elevating CheY concentration may increase the concentration of CheY-p in the cell, but might also stress the cells in other unexpected ways. It is not so clear from this experiment that elevated CheY-p throughout the cell is the reason that they aggregate, or that this outcome is avoided by colocalizing the receptors and motor at the same pole.

      If localization of the receptor array and motor at one pole were important for keeping CheY-p levels low at the opposite pole, then we should expect cells in which the receptors and motor are not at the pole to have higher CheY-p at the opposite pole. According to the authors' argument, this should cause elevated cdG levels and aggregation in the ΔflhF mutants with wild-type levels of CheY. But it does not look like this happened.

      Instead of varying CheY expression, the authors could test their hypothesis that receptor-motor colocalization at the pole is important for preventing crosstalk by measuring cdG levels in the flhF mutant, in which the motor (and maybe the receptor cluster) is no longer localized at the cell pole.

    3. Reviewer #3 (Public Review):

      Summary:

      The authors investigated the assembly and polar localization of the chemosensory cluster in P. aeruginosa. They discovered that a certain protein (FlhF) is required for the polar localization of the chemosensory cluster while a fully assembled motor is necessary for the assembly of the cluster. They found that flagella and chemosensory clusters always co-localize in the cell, either at the cell pole in wild-type cells or at random positions in FlhF mutant cells. They hypothesize that this co-localization is required to keep the level of another protein (CheY-P), which controls motor switching, low, as high levels of this protein (which would arise if the flagella and chemosensory clusters were not co-localized) are associated with high levels of c-di-GMP and cell aggregation.

      Strengths:

      The manuscript is clearly written and straightforward. The authors applied multiple techniques to study the bacterial motility system including fluorescence light microscopy and gene editing. In general, the work enhances our understanding of the subtlety of interaction between the chemosensory cluster and the flagellar motor to regulate cell motility.

      Weaknesses:

      The major weakness in this paper is that the authors never discussed how flagellar gene expression is controlled in P. aeruginosa. For example, in E. coli there is a transcriptional hierarchy for the flagellar genes (early, middle, and late genes, see Chilcott and Hughes, 2000). Similarly, Campylobacter and Helicobacter have a different regulatory cascade for their flagellar genes (see Lertsethtakarn, Ottemann, and Hendrixson, 2011). How does the expression of flagellar genes in P. aeruginosa compare to other species? How many classes are there for these genes? Is there a hierarchy in their expression and how does this affect the results of the FliF and FliG mutants? In other words, if FliF and FliG are in class I (as in E. coli) then their absence might affect the expression of other later flagellar genes in subsequent classes (i.e., chemosensory genes). Also, in both FliF and FliG mutants no assembly intermediates of the flagellar motor are present in the cell as FliG is required for the assembly of FliF (see Terashima et al. 2020, Kaplan et al. 2019, Kaplan et al. 2022). It could be argued that when the motor is not assembled, this will affect the expression of the other genes (e.g., those of the chemosensory cluster), which might play a role in the decreased level of chemosensory clusters the authors find in these mutants.

    1. Reviewer #3 (Public Review):

      Summary:

      In this study, Davies and Plate set out to discover conserved host interactors of coronavirus non-structural proteins (Nsp). They used 293T cells to ectopically express flag-tagged Nsp2 and Nsp4 from five human and mouse coronaviruses, including SARS-CoV-1 and 2, and analyzed their interaction with host proteins by affinity purification mass-spectrometry (AP-MS). To confirm whether such interactors play a role in coronavirus infection, the authors measured the effects of individual knockdowns on replication of murine hepatitis virus (MHV) in mouse Delayed Brain Tumor cells. Using this approach, they identified a previously undescribed interactor of Nsp2, Malectin (Mlec), which is involved in glycoprotein processing and shows a potent pro-viral function in both MHV and SARS-CoV-2. Although the authors were unable to confirm this interaction in MHV-infected cells, they show that infection remodels many other Mlec interactions, recruiting it to the ER complex that catalyzes protein glycosylation (OST). Mlec knockdown reduced viral RNA and protein levels during MHV infection, although such effects were not limited to specific viral proteins. However, knockdown reduced the levels of five viral glycopeptides that map to Spike protein, suggesting it may be affected by Mlec.

      Strengths:

      This is an elegant study that uses a state-of-the-art quantitative proteomic approach to identify host proteins that play critical roles in viral infection. Instead of focusing on a single protein from a single virus, it compares the interactomes of two viral proteins from five related viruses, generating a high confidence dataset. The functional follow-ups using multiple live and reporter viruses, including MHV and CoV2 variants, convincingly depict a pro-viral role for Mlec, a protein not previously implicated in coronavirus biology.

      Weaknesses:

      Although a commonly used approach, AP-MS of ectopically expressed viral proteins may not accurately capture infection-related interactions. The authors observed Mlec-Nsp2 interactions in transfected 293T cells (1C) but were unable to reproduce those in mouse cells infected with MHV (3C). EIF4E2/GIGYF2, two bona fide interactors of CoV2 Nsp2 from previous studies, are listed as depleted compared to negative controls (S1D). Most other CoV2 Nsp2 interactors are also depleted by the same analysis (S1D). Previously reported MERS Nsp2 interactors, including ASCC1 and TCF25, are also not detected (S1D). Furthermore, although GIGYF2 was not identified as an interactor of MHV Nsp2/4 in human cells (S1D), its knockdown in mouse cells reduced MHV titers about 1000-fold (S4). The authors should attempt to explain these discrepancies.

      More importantly, the authors were unable to establish a direct link between Mlec and the biogenesis of any viral or host proteins, by mass-spectrometry or otherwise. Although it is clear that Mlec promotes coronavirus infection, the mechanism remains unclear. Its knockdown does not affect the proteome composition of uninfected cells (S15B), suggesting it is not required for proteome maintenance under normal conditions. The only viral glycopeptides detected during MHV infection originated from Spike (5D), although other viral proteins are also known to be glycosylated. Cells depleted for Mlec produce ~4-fold less Spike protein (4E) but no more than 2-fold less glycosylated Spike peptides (5D), complicating the interpretation of Mlec effects on viral protein biogenesis. Furthermore, Spike is not essential for the pro-viral role of Mlec, given that Mlec knockdown reduces replication of SARS-CoV-2 replicons that express all viral proteins except for Spike (6A/B).
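
      The arithmetic behind the Spike comparison can be made explicit. The sketch below is illustrative only: it assumes that the two fold changes quoted above are directly comparable, and the helper `glycopeptide_per_spike` is a hypothetical name. If total Spike drops about 4-fold while the glycopeptide signal drops at most 2-fold, then the glycopeptide signal per unit of Spike is not lower in knockdown cells; it is about 2-fold higher or more.

```python
def glycopeptide_per_spike(protein_fold_decrease, glycopeptide_fold_decrease):
    """Glycopeptide signal per unit of Spike in knockdown relative to control.

    Both arguments are fold decreases (4.0 means "4-fold less").
    A return value above 1.0 means relatively MORE glycopeptide per Spike
    in the knockdown, below 1.0 means relatively less.
    """
    protein_remaining = 1.0 / protein_fold_decrease
    glycopeptide_remaining = 1.0 / glycopeptide_fold_decrease
    return glycopeptide_remaining / protein_remaining


# Values quoted in the review: ~4-fold less Spike, at most 2-fold less glycopeptide.
print(glycopeptide_per_spike(4.0, 2.0))
```

      Under these assumptions, the data as quoted do not by themselves indicate hypoglycosylation of Spike, which is why the two measurements are hard to reconcile.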

      Any of the observed effects on viral protein levels could be secondary to multiple other processes. Interventions that delay infection for any reason could lead to an imbalance of viral protein levels because Spike and other structural proteins are produced at a much higher rate than non-structural proteins due to the higher abundance of their cognate subgenomic RNAs. Similarly, the observation that Mlec depletion attenuates MHV-mediated changes to the host proteome (S15C/D) can also be attributed to indirect effects on viral replication, regardless of glycoprotein processing. In the discussion, the authors acknowledge that Mlec may indirectly affect infection through modulation of replication complex formation or ER stress, but do not offer any supporting evidence. Interestingly, plant homologs of Mlec are implicated in innate immunity, favoring a more global role for Mlec in mammalian coronavirus infections.

      Finally, the observation that both Nsp2 (3C) and Mlec (3E/F) are recruited to the OST complex during MHV infection neither supports nor refutes any of these alternate hypotheses, given that Mlec is known to interact with OST in uninfected cells and that Nsp2 may interact with OST as part of the full-length unprocessed Orf1a, as it co-translationally translocates into the ER.

      Therefore, the main claims about the role of Mlec in coronavirus protein biogenesis are only partially supported.

    2. eLife assessment

      This is a valuable study that utilizes proteomic and genetic approaches to identify the glycoprotein quality control factor malectin as a pro-viral host protein involved in the replication of coronavirus. The evidence supporting this conclusion is solid, although additional insight into the mechanistic basis of malectin-mediated viral replication would further strengthen this study. This work will be of interest to cell biologists studying the molecular mechanisms of glycoprotein quality control and virologists studying host-pathogen interactions.

    3. Reviewer #1 (Public Review):

      In this manuscript, the authors employ a combined proteomic and genetic approach to identify the glycoprotein QC factor malectin as an important protein involved in promoting coronavirus infection. Using proteomic approaches, they show that the non-structural protein NSP2 and malectin interact in the absence of viral infection, but not in the presence of viral infection. However, both NSP2 and malectin engage the OST complex during viral infection, with malectin also showing reduced interactions with other glycoprotein QC proteins. Malectin KD reduces replication of coronaviruses, including SARS-CoV-2. Collectively, these results identify Malectin as a glycoprotein QC protein involved in regulating coronavirus replication that could potentially be targeted to mitigate coronavirus replication.

      Overall, the experiments described appear well performed and the interpretations generally reflect the results. Moreover, this work identifies Malectin as an important pro-viral protein whose activity could potentially be therapeutically targeted for the broad treatment of coronavirus infection. However, there are some weaknesses in the work that, if addressed, would improve the impact of the manuscript.

      Notably, the mechanism by which malectin regulates viral replication is not well described. It is clear from the work presented that malectin is a pro-viral protein, but the mechanistic basis of this activity is not pursued. Some potential mechanisms are proposed in the discussion, but the manuscript would be strengthened if additional insight were included. For example, is the UPR activated to higher levels in infected cells depleted of malectin? Do glycosylation patterns of viral (or non-viral) proteins change in malectin-depleted cells? Additional insight into this specific question would significantly improve the manuscript.

      Further, the evidence for increased interactions between OST and malectin during viral infection is fairly weak, despite being a major talking point throughout the manuscript. The reduced interactions between malectin and other glycoproteostasis QC factors are evident, but the increased interactions with OST are not well supported. I'd recommend backing off on this point throughout the text and instead continuing to highlight the reduced interactions.

      I was also curious as to why the non-structural proteins nsp2 and nsp4 showed robust interactions with host proteins localized to both the ER and mitochondria. Do these proteins localize to different organelles or do these interactions reflect some other type of dysregulation? It would be useful to provide a bit of speculation on this point.

      Again, the overall identification of malectin as a pro-viral protein involved in the replication of multiple different coronaviruses is interesting and important, but additional insights into the mechanism of this activity would strengthen the overall impact of this work.

    4. Reviewer #2 (Public Review):

      Summary:

      A strong case is presented to establish that the endoplasmic reticulum carbohydrate binding protein malectin is an important factor for coronavirus propagation. Malectin was identified as a coronavirus nsp2 protein interactor using quantitative proteomics and its importance in the viral life cycle was supported by using a functional genetic screen and viral assays. Malectin binds diglucosylated proteins, an early glycoform thought to transiently exist on nascent chains shortly after translation and translocation; yet a role for malectin has previously been proposed in later quality control decisions and degradation targeting. These two observations have been difficult to reconcile temporally. In agreement with results from the Locher lab, the malectin-interactome shown here includes a number of subunits of the oligosaccharyltransferase complex (OST). These results place malectin in close proximity to both the co-translational (STT3A or OST-A) and post-translational (STT3B or OST-B) complexes. It follows that malectin knockdown was associated with coronavirus Spike protein hypoglycosylation.

      Strengths:

      Strengths include using multiple viruses to identify interactors of nsp2 and quantitative proteomics along with multiple viral assays to monitor the viral life cycle.

      Weaknesses:

      Malectin knockdown was shown to be associated with Spike protein hypoglycosylation. This was further supported by malectin interactions with the OSTs. However, no specific role of malectin in glycosylation was discussed or proposed.

      Given the likelihood that malectin plays a role in the glycosylation of heavily glycosylated proteins like Spike, it is unfortunate that only 5 glycosites on Spike were identified using the MS deamidation assay when Spike has a large number of glycans (~22 sites). The mass spec data set would also include endogenous proteins. Were any heavily glycosylated endogenous proteins hypoglycosylated in the MS analysis in Fig 5D?

      The inclusion of the nsp4 interactome and its partial characterization is a distraction from the storyline that focuses on malectin and nsp2.

    5. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      In this manuscript, the authors employ a combined proteomic and genetic approach to identify the glycoprotein QC factor malectin as an important protein involved in promoting coronavirus infection. Using proteomic approaches, they show that the non-structural protein NSP2 and malectin interact in the absence of viral infection, but not in the presence of viral infection. However, both NSP2 and malectin engage the OST complex during viral infection, with malectin also showing reduced interactions with other glycoprotein QC proteins. Malectin KD reduces replication of coronaviruses, including SARS-CoV-2. Collectively, these results identify Malectin as a glycoprotein QC protein involved in regulating coronavirus replication that could potentially be targeted to mitigate coronavirus replication.

      Overall, the experiments described appear well performed and the interpretations generally reflect the results. Moreover, this work identifies Malectin as an important pro-viral protein whose activity could potentially be therapeutically targeted for the broad treatment of coronavirus infection. However, there are some weaknesses in the work that, if addressed, would improve the impact of the manuscript.

      Notably, the mechanism by which malectin regulates viral replication is not well described. It is clear from the work presented that malectin is a pro-viral protein, but the mechanistic basis of this activity is not pursued. Some potential mechanisms are proposed in the discussion, but the manuscript would be strengthened if additional insight were included. For example, is the UPR activated to higher levels in infected cells depleted of malectin? Do glycosylation patterns of viral (or non-viral) proteins change in malectin-depleted cells? Additional insight into this specific question would significantly improve the manuscript.

      We concur with the reviewer that the mechanism by which Malectin regulates viral replication remains unclear. It will be worth pursuing the molecular mechanisms underlying this phenotype in future studies. Our existing proteomics data sets can potentially offer additional insight into the questions posed here. Namely, we plan to analyze levels of protein markers of the UPR and other ER stress pathways in infected cells depleted of Malectin in our existing global proteomics data set. In addition, we will attempt to compare glycosylation patterns of endogenous proteins in Malectin-depleted cells. One caveat to this will be that it may be difficult to differentiate between spontaneous chemical deamidation and enzymatic PNGase F mediated deamidation.

      Further, the evidence for increased interactions between OST and malectin during viral infection is fairly weak, despite being a major talking point throughout the manuscript. The reduced interactions between malectin and other glycoproteostasis QC factors are evident, but the increased interactions with OST are not well supported. I'd recommend backing off on this point throughout the text and instead continuing to highlight the reduced interactions.

      We note that the fold-change increase in OST interactions with malectin is small compared with the fold-change decrease in interactions with other glycoproteostasis factors. If this modest increase is consistent across replicates, we believe this bolsters the claim that it is a noteworthy change. If not, we can modify the text as suggested to emphasize the reduced interactions.

      I was also curious as to why the non-structural proteins nsp2 and nsp4 showed robust interactions with host proteins localized to both the ER and mitochondria. Do these proteins localize to different organelles or do these interactions reflect some other type of dysregulation? It would be useful to provide a bit of speculation on this point.

      We also find these ER and mitochondrial protein interactions curious, and we initially reported on them in Davies, Almasy et al. 2020 (ACS Infectious Diseases). In this prior report, we found that when expressed in HEK293T cells, SARS-CoV-2 nsp2 and nsp4 partially localize to mitochondria-associated ER membranes (MAMs), as determined by subcellular fractionation. Given that malectin has also been shown to localize to MAMs (Carreras-Sureda et al. 2019, Nature Cell Biology), we can insert some speculation on this in the Discussion section.

      Again, the overall identification of malectin as a pro-viral protein involved in the replication of multiple different coronaviruses is interesting and important, but additional insights into the mechanism of this activity would strengthen the overall impact of this work.

      Reviewer #2 (Public Review):

      Summary:

      A strong case is presented to establish that the endoplasmic reticulum carbohydrate binding protein malectin is an important factor for coronavirus propagation. Malectin was identified as a coronavirus nsp2 protein interactor using quantitative proteomics and its importance in the viral life cycle was supported by using a functional genetic screen and viral assays. Malectin binds diglucosylated proteins, an early glycoform thought to transiently exist on nascent chains shortly after translation and translocation; yet a role for malectin has previously been proposed in later quality control decisions and degradation targeting. These two observations have been difficult to reconcile temporally. In agreement with results from the Locher lab, the malectin-interactome shown here includes a number of subunits of the oligosaccharyltransferase complex (OST). These results place malectin in close proximity to both the co-translational (STT3A or OST-A) and post-translational (STT3B or OST-B) complexes. It follows that malectin knockdown was associated with coronavirus Spike protein hypoglycosylation.

      Strengths:

      Strengths include using multiple viruses to identify interactors of nsp2 and quantitative proteomics along with multiple viral assays to monitor the viral life cycle.

      Weaknesses:

      Malectin knockdown was shown to be associated with Spike protein hypoglycosylation. This was further supported by malectin interactions with the OSTs. However, no specific role of malectin in glycosylation was discussed or proposed.

      We will emphasize our hypotheses on this point in the discussion and add a summary figure to highlight the specific role of malectin.

      Given the likelihood that malectin plays a role in the glycosylation of heavily glycosylated proteins like Spike, it is unfortunate that only 5 glycosites on Spike were identified using the MS deamidation assay when Spike has a large number of glycans (~22 sites). The mass spec data set would also include endogenous proteins. Were any heavily glycosylated endogenous proteins hypoglycosylated in the MS analysis in Fig 5D?

      We plan to interrogate this question in our existing MS deamidation proteomics data set as outlined above.

      The inclusion of the nsp4 interactome and its partial characterization is a distraction from the storyline that focuses on malectin and nsp2.

      We believe the nsp4 comparative interactome and functional genomics data offer a rich resource for further functional investigation by others, if made public. While we found the malectin and nsp2 storyline the most compelling to pursue, we believe the inclusion of the nsp4 data strengthens the overall approach, in agreement with Reviewer #3's comments.

      Reviewer #3 (Public Review):

      Summary:

      In this study, Davies and Plate set out to discover conserved host interactors of coronavirus non-structural proteins (Nsp). They used 293T cells to ectopically express flag-tagged Nsp2 and Nsp4 from five human and mouse coronaviruses, including SARS-CoV-1 and 2, and analyzed their interaction with host proteins by affinity purification mass-spectrometry (AP-MS). To confirm whether such interactors play a role in coronavirus infection, the authors measured the effects of individual knockdowns on replication of murine hepatitis virus (MHV) in mouse Delayed Brain Tumor cells. Using this approach, they identified a previously undescribed interactor of Nsp2, Malectin (Mlec), which is involved in glycoprotein processing and shows a potent pro-viral function in both MHV and SARS-CoV-2. Although the authors were unable to confirm this interaction in MHV-infected cells, they show that infection remodels many other Mlec interactions, recruiting it to the ER complex that catalyzes protein glycosylation (OST). Mlec knockdown reduced viral RNA and protein levels during MHV infection, although such effects were not limited to specific viral proteins. However, knockdown reduced the levels of five viral glycopeptides that map to Spike protein, suggesting it may be affected by Mlec.

      Strengths:

      This is an elegant study that uses a state-of-the-art quantitative proteomic approach to identify host proteins that play critical roles in viral infection. Instead of focusing on a single protein from a single virus, it compares the interactomes of two viral proteins from five related viruses, generating a high confidence dataset. The functional follow-ups using multiple live and reporter viruses, including MHV and CoV2 variants, convincingly depict a pro-viral role for Mlec, a protein not previously implicated in coronavirus biology.

      Weaknesses:

      Although a commonly used approach, AP-MS of ectopically expressed viral proteins may not accurately capture infection-related interactions. The authors observed Mlec-Nsp2 interactions in transfected 293T cells (1C) but were unable to reproduce those in mouse cells infected with MHV (3C). EIF4E2/GIGYF2, two bona fide interactors of CoV2 Nsp2 from previous studies, are listed as depleted compared to negative controls (S1D). Most other CoV2 Nsp2 interactors are also depleted by the same analysis (S1D). Previously reported MERS Nsp2 interactors, including ASCC1 and TCF25, are also not detected (S1D). Furthermore, although GIGYF2 was not identified as an interactor of MHV Nsp2/4 in human cells (S1D), its knockdown in mouse cells reduced MHV titers about 1000-fold (S4). The authors should attempt to explain these discrepancies.

      We plan to address these discrepancies with further elaboration in the text.

      More importantly, the authors were unable to establish a direct link between Mlec and the biogenesis of any viral or host proteins, by mass-spectrometry or otherwise. Although it is clear that Mlec promotes coronavirus infection, the mechanism remains unclear. Its knockdown does not affect the proteome composition of uninfected cells (S15B), suggesting it is not required for proteome maintenance under normal conditions. The only viral glycopeptides detected during MHV infection originated from Spike (5D), although other viral proteins are also known to be glycosylated. Cells depleted for Mlec produce ~4-fold less Spike protein (4E) but no more than 2-fold less glycosylated spike peptides (5D), compounding the interpretation of Mlec effects on viral protein biogenesis. Furthermore, Spike is not essential for the pro-viral role of Mlec, given that Mlec knockdown reduces replication of SARS-CoV-2 replicons that express all viral proteins except for Spike (6A/B).

      These are all important points. We plan to acknowledge some of these compounding factors in the Discussion.

      Any of the observed effects on viral protein levels could be secondary to multiple other processes. Interventions that delay infection for any reason could lead to an imbalance of viral protein levels because Spike and other structural proteins are produced at a much higher rate than non-structural proteins due to the higher abundance of their cognate subgenomic RNAs. Similarly, the observation that Mlec depletion attenuates MHV-mediated changes to the host proteome (S15C/D) can also be attributed to indirect effects on viral replication, regardless of glycoprotein processing. In the discussion, the authors acknowledge that Mlec may indirectly affect infection through modulation of replication complex formation or ER stress, but do not offer any supporting evidence. Interestingly, plant homologs of Mlec are implicated in innate immunity, favoring a more global role for Mlec in mammalian coronavirus infections.

      We plan to interrogate our existing proteomics data for signatures of ER stress in Mlec-depleted cells (as outlined above).

      Finally, the observation that both Nsp2 (3C) and Mlec (3E/F) are recruited to the OST complex during MHV infection neither supports nor refutes any of these alternate hypotheses, given that Mlec is known to interact with OST in uninfected cells and that Nsp2 may interact with OST as part of the full-length unprocessed Orf1a, as it co-translationally translocates into the ER. Therefore, the main claims about the role of Mlec in coronavirus protein biogenesis are only partially supported.

      We plan to acknowledge this alternative hypothesis in the Discussion.

    1. eLife assessment

      This important study provides substantial technical development for neural circuit tracing in larval zebrafish, a widely used model for systems and developmental neurobiology, and the tool could greatly benefit neural circuit research by enabling a detailed investigation of circuit structure and function in a major model organism. The supporting evidence is solid, although a more detailed description of validation experiments would have increased confidence in the technique's utility. The work will interest zebrafish neurobiologists who are working on identifying novel neuronal connectivity patterns, provided that reagents generated in this study are made widely available; issues such as glial cell labeling, detailed toxicity analysis, and the impact of virus dose on tracing efficiency need further exploration to enhance the findings' applicability and robustness.

    2. Reviewer #1 (Public Review):

      EnvA-pseudotyped glycoprotein-deleted rabies virus has emerged as an essential tool for tracing monosynaptic inputs to genetically defined neuron populations in the mammalian brain. Recently, in addition to the SAD B19 rabies virus strain first described by Callaway and colleagues in 2007, the CVS N2c rabies virus strain has become popular due to its low toxicity and high trans-synaptic transfer efficiency. However, despite its widespread use in the mammalian brain, particularly in mice, the application of this cell-type-specific monosynaptic rabies tracing system in zebrafish has been limited by low labeling efficiency and high toxicity. In this manuscript, the authors aimed to develop an efficient retrograde monosynaptic rabies-mediated circuit mapping tool for larval zebrafish. Given the translucent nature of larval zebrafish, whole-brain neuronal activities can be monitored, perturbed, and recorded over time. Introducing a robust circuit mapping tool for larval zebrafish would enable researchers to simultaneously investigate the structure and function of neural circuits, which would be of significant interest to the neural circuit research community. Furthermore, the ability to track rabies-labeled cells over time in the transparent brain could enhance our understanding of the trans-synaptic retrograde tracing mechanism of the rabies virus.

      To establish an efficient rabies virus tracing system in the larval zebrafish brain, the authors conducted meticulous side-by-side experiments to determine the optimal combination of trans-expressed rabies G proteins, TVA receptors, and recombinant rabies virus strains. Consistent with observations in the mouse brain, the CVS N2c strain trans-complemented with N2cG was found to be superior to the SAD B19 combination, offering lower toxicity and higher efficiency in labeling presynaptic neurons. Additionally, the authors tested various temperatures for the larvae post-virus injection and identified 36 °C as the optimal temperature for improved virus labeling. They then validated the system in the cerebellar circuits, noting evolutionary conservation in the cerebellar structure between zebrafish and mammals. The monosynaptic inputs to Purkinje cells from granule cells were neatly confirmed through ablation experiments.

      However, there are a couple of issues that this study should address. Additionally, conducting some extra experiments could provide valuable information to the broader research field utilizing recombinant rabies viruses as retrograde tracers.

      (1) It was observed that many radial glia were labeled, which casts doubt on the specificity of trans-synaptic spread between neurons. The issues of transneuronal labeling of glial cells should be addressed and discussed in more detail. In this manuscript, the authors used a transgenic zebrafish line carrying a neuron-specific Cre-dependent reporter and EnvA-CVS N2c(dG)-Cre virus to avoid the visualization of virally infected glial cells. However, this does not solve the real issue of glial cell labeling and the possibility of a non-synaptic spread mechanism.

      In addition, the citations in Line 307 are incorrect where the text refers to previous studies that reported the same issue of RVdG-based transneuronal labeling of radial glial cells.

      "The RVdG-based transneuronal labeling of radial glial cells was commonly observed in larval zebrafish29,30".

      The cited work was conducted using vesicular stomatitis virus (VSV). A more thorough analysis and/or discussion on this topic should be included. Several key questions should be addressed:

      Does the number of labeled glial cells increase over time?

      Do they increase at the same rate over time as labeled neurons?

      Are the labeled glial cells only present around the injection site?

      Can the phenomenon of transneuronal labeling of radial glial cells be mitigated if the tracing is done in slightly older larvae?

      What is the survival rate of the infected glial cells over time?

      If an infected glial cell dies due to infection or gets ablated, does the rabies virus spread from the dead glial cells?

      If TVA and rabies G are delivered to glial cells, followed by rabies virus injection, will it lead to the infection of other glial cells or neurons?

      Answers to any of these questions could greatly benefit the broader research community.

      (2) The optimal virus tracing effect is achieved only by raising the injected larvae at 36 °C. Since zebrafish are routinely raised at around 28 °C, a more thorough characterization of the effect of the elevated temperature on the health of the zebrafish should be conducted.

      (3) Because the infected larval zebrafish brain can be followed by time-lapse imaging, the system can be used to tackle important open issues with rabies virus tracing tools.

      a) Toxicity.

      The toxicity of rabies viruses is an important issue that limits their application and affects the interpretation of traced circuits. For example, if a significant proportion of starter cells die before analysis, the traced presynaptic networks cannot be reliably assigned to a "defined" population of starter cells. In this manuscript, the authors did an excellent job of characterizing the effects of different rabies strains, G proteins derived from various strains, and levels of G protein expression on starter cell survival. However, an additional parameter that should be tested is the dose of rabies virus injection. The current method section states that all rabies virus preparations were diluted to 2x10^8 infection units per ml, and 2-5 nl of virus suspension was injected near the target cells. It would be interesting to know the impact of the dose/volume of virus injection on retrograde tracing efficiency and toxicity. Would higher titers of the virus lead to more efficient labeling but stronger toxicity? What would be the optimal dose/volume to balance efficiency and toxicity? Addressing these questions would provide valuable insights and help optimize the use of rabies viruses for circuit tracing.

      b) Primary starters and secondary starters:

      Given that the trans-expression of TVA and G is widespread, there is the possibility of coexistence of starter cells from the initial infection (primary starters) and starter cells generated by rabies virus spreading from the primary starters to presynaptic neurons expressing G. This means that the labeled input cells could be a mixed population connected with either the primary or secondary starter cells.

      It would be immensely interesting if time-lapse imaging could be utilized to observe the appearance of such primary and secondary starter cells. Assuming there is a time difference between the initial appearance of these two populations, it may be possible to differentiate the input cells wired to these populations based on a similar temporal difference in their initial appearance. This approach could provide valuable insights into the dynamics of rabies virus spread and the connectivity of neural circuits.

    3. Reviewer #2 (Public Review):

      The study by Chen, Deng et al. aims to develop an efficient viral transneuronal tracing method that allows efficient retrograde tracing in the larval zebrafish. The authors utilize pseudotyped rabies virus that can be targeted to specific cell types using the EnvA-TVA system. Pseudotyped rabies virus has been used extensively in rodent models and, in recent years, has begun to be developed for use in adult zebrafish. However, compared to rodents, the efficiency of the spread in adult zebrafish is very low (~one upstream neuron labeled per starter cell). Additionally, there is limited evidence of retrograde tracing with pseudotyped rabies in the larval stage, which is the stage when most functional neural imaging studies are done in the field. In this study, the authors systematically optimized several parameters of rabies tracing, including different rabies virus strains, glycoprotein types, temperatures, expression construct designs, and elimination of glial labeling. The optimal configurations developed by the authors yield up to 5-10-fold higher tracing efficiency than more typically used configurations.

      The results are solid and support the conclusions. However, the methods should be described in more detail to allow other zebrafish researchers to apply this method in their own work.

      Additionally, some findings are presented anecdotally, i.e., without quantification or sufficient detail to allow close examinations. Lastly, there is concern that the reagents created by the authors will not be easily accessible to the zebrafish community.

      (1) The titer used in each experiment was not stated. In the methods section, it is stated that aliquots are stored at 2x10^8. Is the virus diluted for injection? Were all of the experiments in the manuscript performed with the same titer?

      (2) The age for injection is quite broad (3-5 dpf in Fig 1 and 4-6 dpf in Fig 2). Given that viral spread efficiency is usually more robust in younger animals, describing the exact injection age for each experiment is critical.

      (3) More details should be provided for the paired electrical stimulation-calcium imaging study. How many GC cells were tested? How many had corresponding PC cell responses? What is the response latency? For example, images of stimulated and recorded GCs and PCs should be shown.

      (4) It is unclear how connectivity between specific PC and GC is determined for single neuron connectivity. In other images (Figure 4C), there are usually multiple starter cells and many GCs. It was not shown that the image resolution can establish clear axon-dendritic contacts between cell pairs.

    4. Reviewer #3 (Public Review):

      Summary:

      The authors establish reagents and define experimental parameters useful for defining neurons retrograde to a neuron of interest.

      Strengths:

      A clever approach, careful optimization, novel reagents, and convincing data together lead to convincing conclusions.

      Weaknesses:

      In the current version of the manuscript, the tracing results could be better centered with respect to past work, certain methods could be presented more clearly, and other approaches are worth considering.

      Appraisal/Discussion:

      Trans-neuronal tracing in the larval zebrafish preparation has lagged behind rodent models, limiting "circuit-cracking" experiments. Previous work has demonstrated that pseudotyped rabies virus-mediated tracing could work, but published data suggested that there was considerable room for optimization. The authors take a major step forward here, identifying a number of key parameters to achieve success and establishing new transgenic reagents that incorporate modern intersectional approaches. As a proof of concept, the manuscript concludes with a rough characterization of inputs to cerebellar Purkinje cells. The work will be of considerable interest to neuroscientists who use the zebrafish model.

    5. Author response:

      We are grateful to the reviewers for their insightful comments on our manuscript and are encouraged by their overall favorable assessments. For the eLife Version of Record, we will make the following revisions to address reviewers’ comments and broaden the applicability of our technique in the zebrafish research community:

      (1) We will elaborate on various facets with additional details:

      a) Experimental conditions | We will specify the transgenic background, injected plasmids, larval stage, viral type, and viral titer clearly for each related experiment.

      b) Experimental methods | We will describe in more detail how to inject the virus into a target area in larval zebrafish.

      c) Data analysis | We will provide more detailed information on the paired electrical stimulation-calcium imaging study and on identifying connected Purkinje cells and granule cells during circuit reconstruction.

      d) Discussion | We will elaborate on trans-synaptic specificity concerning glial cell labeling, toxicity related to viral dose and temperature, and the potential issue of secondary starters and multi-step circuit tracing.

      (2) We will address the issue of glial cell labeling by adding more discussion and characterization, including potential mechanisms and implications, cell distribution, labeling progress, survival, and capability for viral transmission as starter cells.

      (3) We will modify the text of the manuscript to clarify additional points raised by the reviewers.

      (4) We will provide public repositories for accessing both the materials and the associated information for the zebrafish lines, plasmids, viral vectors, and reconstructed data generated in this study.

      In the end, we will submit full responses to the reviewer comments along with the revised version of the manuscript.

    1. eLife assessment

      The manuscript establishes a sophisticated mouse model for acute retinal artery occlusion (RAO) by combining unilateral pterygopalatine ophthalmic artery occlusion (UPOAO) with a silicone wire embolus and carotid artery ligation, generating ischemia-reperfusion injury upon removal of the embolus. This clinically relevant model is useful for studying the cellular and molecular mechanisms of RAO. The data overall are solid, presenting a novel tool for screening pathogenic genes and promoting further therapeutic research in RAO.

    2. Reviewer #1 (Public Review):

      Summary:

      Wang, Y. et al. used a silicone wire embolus to definitively and acutely clot the pterygopalatine ophthalmic artery in addition to carotid artery ligation to completely block the blood supply to the mouse inner retina, which mimics clinical acute retinal artery occlusion. A detailed characterization of this mouse model determined the time course of inner retina degeneration and associated functional deficits, which closely mimic human patients. Whole retina transcriptome profiling and comparison revealed distinct features associated with ischemia, reperfusion, and different model mechanisms. Interestingly and importantly, this team found a sequential event including reperfusion-induced leukocyte infiltration from blood vessels, residual microglial activation, and neuroinflammation that may lead to neuronal cell death.

      Strengths:

      Clear demonstration of the surgery procedure with informative illustrations, images, and superb surgical videos.

      Two time points of ischemia and reperfusion were studied with convincing histological and in vivo data to demonstrate the time course of various changes in retinal neuronal cell survival, ERG functions, and inner/outer retina thickness.

      The transcriptome comparison among different retinal artery occlusion models provides informative evidence to differentiate these models.

      The potential applications of the in vivo retinal ischemia-reperfusion model and relevant readouts demonstrated by this study will certainly inspire further investigation of the dynamic morphological and functional changes of retinal neurons and glial cell responses during disease progression and before and after treatments.

      Weaknesses:

      It would be beneficial to the manuscript and the readers if the authors could improve the English of this manuscript by correcting obvious grammar errors, eliminating many of the acronyms that are not commonly used by the field, and providing a reason why this complicated but clever surgery procedure was designed and a summary table with time course of all the morphological, functional, cellular, and transcriptome changes associated with this model.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors of this manuscript aim to develop a novel animal model to accurately simulate the retinal ischemic process in retinal artery occlusion (RAO). A unilateral pterygopalatine ophthalmic artery occlusion (UPOAO) mouse model was established using silicone wire embolization combined with carotid artery ligation. This manuscript provided data to show the changes of major classes of retinal neural cells and visual dysfunction following various durations of ischemia (30 minutes and 60 minutes) and reperfusion (3 days and 7 days) after UPOAO. Additionally, transcriptomics was utilized to investigate the transcriptional changes and elucidate changes in the pathophysiological process in the UPOAO model post-ischemia and reperfusion. Furthermore, the authors compared transcriptomic differences between the UPOAO model and other retinal ischemic-reperfusion models, including HIOP and UCCAO, and revealed unique pathological processes.

      Strengths:

      The UPOAO model represents a novel approach for studying retinal artery occlusion. The study is very comprehensive.

      Weaknesses:

      Originally, some statements were incorrect and confusing. However, the authors have made clarifications in the revised manuscript to avoid confusion.

    4. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment:

      The manuscript establishes a sophisticated mouse model for acute retinal artery occlusion (RAO) by combining unilateral pterygopalatine ophthalmic artery occlusion (UPOAO) with a silicone wire embolus and carotid artery ligation, generating ischemia-reperfusion injury upon removal of the embolus. This clinically relevant model is useful for studying the cellular and molecular mechanisms of RAO. The data overall are solid, presenting a novel tool for screening pathogenic genes and promoting further therapeutic research in RAO.

      Thank you for recognizing the sophistication and clinical relevance of our mouse model for acute retinal artery occlusion. We are grateful for your supportive feedback.

      Public reviews:

      (1) Response to Reviewer #1: 

      Summary:

      Wang, Y. et al. used a silicone wire embolus to definitively and acutely clot the pterygopalatine ophthalmic artery in addition to carotid artery ligation to completely block the blood supply to the mouse inner retina, which mimics clinical acute retinal artery occlusion. A detailed characterization of this mouse model determined the time course of inner retina degeneration and associated functional deficits, which closely mimic human patients. Whole retina transcriptome profiling and comparison revealed distinct features associated with ischemia, reperfusion, and different model mechanisms. Interestingly and importantly, this team found a sequential event including reperfusion-induced leukocyte infiltration from blood vessels, residual microglial activation, and neuroinflammation that may lead to neuronal cell death.

      Strengths:

      Clear demonstration of the surgery procedure with informative illustrations, images, and superb surgical videos.

      Two time points of ischemia and reperfusion were studied with convincing histological and in vivo data to demonstrate the time course of various changes in retinal neuronal cell survival, ERG functions, and inner/outer retina thickness.

      The transcriptome comparison among different retinal artery occlusion models provides informative evidence to differentiate these models.

      The potential applications of the in vivo retinal ischemia-reperfusion model and relevant readouts demonstrated by this study will certainly inspire further investigation of the dynamic morphological and functional changes of retinal neurons and glial cell responses during disease progression and before and after treatments.

      We sincerely appreciate your detailed and positive feedback. These evaluations are invaluable in highlighting the significance and impact of our work. Thank you for your thoughtful and supportive review.

      Weaknesses:

      It would be beneficial to the manuscript and the readers if the authors could improve the English of this manuscript by correcting obvious grammar errors, eliminating many of the acronyms that are not commonly used by the field, and providing a reason why this complicated but clever surgery procedure was designed and a summary table with the time course of all the morphological, functional, cellular, and transcriptome changes associated with this model.

      Thank you for your thorough review of the manuscript. We sincerely apologize for any grammatical errors resulting from our English language proficiency and have taken the necessary steps to polish the article. Additionally, we have heeded your advice and reduced the use of field-specific acronyms to enhance readability for both the manuscript and its readers.

      Regarding the rationale behind the design of the UPOAO model, we have provided a description in the Introduction section. Our group focuses on research into the pathogenesis and clinical treatment of RAO. The absence of an accurate mouse model simulating the retinal ischemic process has hampered progress in developing neuroprotective agents for RAO. To better simulate the retinal ischemic process and possible ischemia-reperfusion injury following RAO, we developed a novel vascular-associated mouse model called the unilateral pterygopalatine ophthalmic artery occlusion (UPOAO) model. We drew inspiration from the middle cerebral artery occlusion (MCAO) model, which is widely used in cerebral ischemic injury research and guided the development of the UPOAO model.

      We appreciate your valuable suggestion regarding the inclusion of a summary table outlining the time course of morphological, functional, cellular, and transcriptome changes associated with this model. To address this, we intend to include a supplementary table at the end of the article (Table S2, Summary Table), which will offer a comprehensive overview of the experimental results, thereby aiding clarity and interpretation.

      Once again, we thank you for your insightful comments and suggestions, which have greatly contributed to the improvement of our manuscript.

      (2) Response to Reviewer #2: 

      Summary:

      The authors of this manuscript aim to develop a novel animal model to accurately simulate the retinal ischemic process in retinal artery occlusion (RAO). A unilateral pterygopalatine ophthalmic artery occlusion (UPOAO) mouse model was established using silicone wire embolization combined with carotid artery ligation. This manuscript provided data to show the changes in major classes of retinal neural cells and visual dysfunction following various durations of ischemia (30 minutes and 60 minutes) and reperfusion (3 days and 7 days) after UPOAO. Additionally, transcriptomics was utilized to investigate the transcriptional changes and elucidate changes in the pathophysiological process in the UPOAO model post-ischemia and reperfusion. Furthermore, the authors compared transcriptomic differences between the UPOAO model and other retinal ischemic-reperfusion models, including HIOP and UCCAO, and revealed unique pathological processes.

      Strengths:

      The UPOAO model represents a novel approach to studying retinal artery occlusion. The study is very comprehensive.

      We greatly appreciate your positive assessment of our work and are encouraged by your recognition of its significance.

      Weaknesses:

      Some statements are incorrect and confusing. It would be helpful to review and clarify these to ensure accuracy and improve readability.

      We sincerely appreciate your meticulous review of the manuscript. Taking into account your valuable feedback, we will thoroughly address the inaccuracies identified in the revised version. Additionally, we will commit to polishing the article to ensure improved readability. We apologize for any confusion caused by these inaccuracies and genuinely thank you for bringing them to our attention.

      Recommendations For The Authors:

      Reviewer #1:

      (1) Response to comment:

      The conclusions of this paper are mostly well supported by clear images and convincing data analysis, but some aspects of image presentation and additional data analysis may be needed to strengthen the manuscript.

      We sincerely appreciate your positive assessment of our work and your recognition of the clear images and convincing data analysis supporting our conclusions. Your constructive feedback on enhancing the clarity of our manuscript's image presentation and additional data analysis is highly valued. In response to your suggestions, we have taken steps to improve readability by removing or correcting uncommon acronyms from certain images. We have also conducted further data analysis to provide more comprehensive insights. Thank you for your guidance in improving the quality of our manuscript.

      (2) Response to recommendation (1):

      In Results 3.1 or in Method 2.2: please explain why this combination of silicone wire embolization and carotid artery ligation was chosen to replace previous models such as UCCAO. What are the advantages? And why was the silicone wire embolus inserted through the ECA instead of directly into the CCA? The cleverly designed surgical procedure is very impressive, but the reasoning behind it is not obvious and needs more explanation.

      Thank you for your valuable feedback.

      In the introduction, we briefly describe the rationale for developing the UPOAO model to simulate the acute ischemia-reperfusion of retinal artery occlusion (RAO). Previous common retinal ischemia models have certain shortcomings. For example, in the HIOP model, which is often used to simulate glaucoma, the ischemic effect of interrupted retinal blood flow may be amplified by the dual effects of IOP-induced mechanical stress [1, 2] and vascular ischemia caused by normal saline perfusion in the anterior chamber. In the UCCAO model, recanalization is performed after ligation of the carotid blood vessels, and the retina communicates with the blood vessels in the brain, resulting in retinal hypoperfusion. Retinal ischemia in UCCAO is a chronic process (for example, the retina became thinner at week 10 and week 15 [3]), whereas RAO is an acute, total retinal ischemic disease. Therefore, it is critically important to develop a simple mouse model that can simulate acute retinal ischemia and reperfusion injury in RAO patients.

      Various models have been developed for ischemic stroke research, with the endoluminal suture model being the most commonly employed method for middle cerebral artery occlusion (MCAO). In this model, filaments are introduced through either the external or internal carotid artery and advanced into the middle cerebral artery, causing temporary blood flow blockage for a specific duration. This method has been extensively employed in studies involving transient occlusion [4]. Among the MCAO models, the Koizumi method (occlusion from the common carotid artery (CCA) to the middle cerebral artery (MCA)) and the Longa method (occlusion from the external carotid artery (ECA) to the MCA) are frequently used. Of the two, the Longa method is more widely used in research studies and has a much lower post-surgery mortality rate (26%) than the Koizumi method (44%) [5]. The MCAO model induces substantial infarct areas and has contributed significantly to advances in stroke research, including investigations into blood-brain barrier disruption and inflammatory responses to ischemia.

      RAO is considered a form of ocular stroke. Inspired by the MCAO model, we have employed a silicone wire embolus to induce acute interruption of blood flow to the retina. This approach enables the investigation of pathophysiological processes associated with RAO, providing valuable insights into the understanding of this condition. We have clarified these points in the revised manuscript (line 129).

      The reasoning behind inserting the silicone wire embolus through the ECA instead of directly into the CCA is twofold:

      (1) Convenience and avoidance of heavy bleeding and mortality. Inserting the silicone wire embolus requires creating an opening in the artery, which then needs to be ligated at both ends after the silicone wire embolus is removed to prevent excessive bleeding. Because the ECA can form a straight line with the ICA after folding, the silicone wire embolus is easier to insert and remove through the ECA. The blood flow to the CCA can be restored after the plug is removed from the ECA, ensuring that the blood supply to the brain through the CCA is not affected.

      (2) Preservation of the reperfusion process. If the silicone wire embolus were inserted directly into the CCA, the ends of the CCA opening would need to be ligated after the silicone wire embolus is removed. This would prevent reperfusion after retinal ischemia. To enable reperfusion, we opened the ECA instead.

      We have clarified these points in the revised manuscript to better explain the rationale behind our methodology (line 139). Thank you for prompting this important clarification, which we believe will enhance the understanding of our readers.

      (3) Response to recommendation (2):

      Did the UPOPA actually block OA, including both the retinal (CRA) and choroidal (SPCA and LPCA) blood supply? If so, why does it seem only the inner retina was affected but not the outer retina?

      Thank you for your question. We agree with you that the UPOAO model blocks the OA, which includes retinal and choroidal vessels. Our experimental results primarily indicate damage to the inner retinal layer within 7 days of reperfusion. For example, OCT and HE staining showed significant thinning of the inner retina after 60 minutes of ischemia followed by 7 days of reperfusion (Figure 4). At the same time, the b-wave amplitudes were decreased, which usually indicates damage to the inner layer of the retina. However, the outer retina did not appear to be affected by 60 minutes of ischemia based on the results of OCT, HE and immunofluorescence.

      The inner layer of the retina is known to show the highest sensitivity to hypoxic challenges [6], whereas the outer retinal layer is more resistant to hypoxic stress [7]. A possible reason for our results is that outer-layer cells such as photoreceptors tolerate ischemia better than the inner layer of the retina. Previous studies of retinal ischemia-reperfusion models support this assumption. In the UCCAO model, the b-wave was more affected than the a-wave. Decreases in the amplitudes of OPs, scotopic b-wave, and photopic b-wave were consistently observed at week 4 after UCCAO, while the amplitude of the scotopic a-wave did not change dramatically [8]. Prolonged ischemia, such as permanent ischemia, led to photoreceptor cell degradation, as seen in Stevens et al.'s report of photoreceptor loss 3 months after permanent ligation of both common carotid arteries in bilateral common carotid artery occlusion (BCCAO) [9]. In the HIOP model, the GCL and INL reacted sensitively to ischemic processes. Significant thinning of the GCL was observed as early as 6 hours after 60 minutes of ischemia [10]. Horizontal cells and photoreceptors remained mostly unaffected, while most RGCs and several amacrine cell subtypes disappeared [11, 12].

      Our study revealed the changes that occurred within 60 minutes of ischemia and the first 7 days of reperfusion in the UPOAO model. One possibility is that the ischemia duration in our model was not long enough to affect the outer retinal cells. Furthermore, the reperfusion observation period may not have been long enough to reveal structural damage and visual dysfunction in the outer retinal layer. As we have explained in the manuscript, further exploration is needed to understand changes induced by longer ischemia duration and reperfusion periods. Characterizing the damage to retinal structure and function after longer ischemia times will be a focus of our future research.

      (4) Response to recommendation (3):

      Better to only use well-accepted acronyms and remove those that are rarely seen in other publications, such as IMRL, MRL, HIOP, TRT, etc.

      Thank you for your valuable feedback. In our manuscript, we utilized the Spectralis HRA+OCT device (Heidelberg) to capture the retinal images. However, the resulting images did not clearly distinguish each retinal layer. To address this limitation, we referred to a clinical OCT stratification approach in RVO and divided the retina into the inner, middle, and outer layers [16]. We acknowledge that this hierarchical description is not commonly used and have therefore followed your recommendation to remove these rare acronyms and instead employ the layer structure abbreviations joined by a plus sign. The methods and results have been revised accordingly (line 213, line 368, Figure 4 and Figure S2).

      In addition, the HIOP model is also known as the IR or RIRI model [17-19], and the pathophysiological process of retinal ischemia-reperfusion injury (IRI) is usually used to represent this type of anterior chamber perfusion model. To avoid confusion between the pathophysiological process of ischemia-reperfusion studied in this paper and the common model of high intraocular pressure, we have consistently referred to it as the HIOP model, an abbreviation that is cited in many references [20-22].

      Thanks again for the suggestion. We apologize for any confusion caused by the use of abbreviations and have made the necessary corrections in the manuscript. We have also strengthened the details of OCT layering in the images to enhance readability for our audience.

      (5) Response to recommendation (4):

      Figure 3F, G: What do the OP changes mean? What retina cell dysfunction leads to OP changes? Is there RGC-relevant visual function readout to correlate with RGC death?

      Oscillatory potentials (OPs) are important components of the electroretinogram (ERG). While the precise origin of OPs remains unclear, they are generally believed to be generated from the inner retinal layer, specifically involving bipolar cells, amacrine cells and ganglion cells [23]. OPs are sensitive indicators of retinal ischemic effects and can detect dysfunction before alterations in the b-waves occur [24-26] (We have added these statements at line 358). In this research, the reduction of OPs indicated dysfunction in the inner retinal layer and retinal ischemia.

      The function of RGCs can be non-invasively assessed by using various ERG techniques that emphasize the activity of inner retinal neurons, including OPs of multifocal ERG (mfERG), photopic negative response (PhNR) in mfERG, pattern electroretinogram (PERG), and negative Scotopic Threshold Response (nSTR) [27]. Among these indicators, the PERG appears to be more specifically related to the presence of functional RGCs. However, the complexity of electrophysiological sources and species-specific differences in RGC characteristics should also be considered. In addition, visual evoked potentials (VEP) can assess the function of visual signaling in the whole visual pathway from RGC axons to the visual cortex of the brain [28, 29]. Unfortunately, because the specific equipment required for evaluating RGC function was unavailable, we could not conduct a comprehensive assessment in this study. This limitation emphasizes the importance of future studies incorporating RGC evaluation, using indicators such as PERG and PhNR, to provide a more comprehensive understanding of visual pathway functionality and its implications.

      Thank you for your careful review and insightful questions.

      (6) Response to recommendation (5):

      Figure 4B: RNFL/GCL/IPL normally called GCC (ganglion cell complex).

      We appreciate your helpful recommendation regarding the abbreviation GCC (ganglion cell complex) for the combination of RNFL, GCL, and IPL. We have updated this terminology in the revised manuscript (line 213 and Figure 4).

      (7) Response to recommendation (6):

      Figure 4 A-F: Normally a circular OCT image surrounding the optic nerve head is preferred to measure retina thickness. If in these figures, all the OCT images are from the same location, it may be acceptable, but need to provide imaging details on how these OCT planes are selected and what has been done to make sure the same locations were selected for comparison.

      We agree with your comment that OCT images for measuring retinal thickness are usually captured around the optic nerve head. In this study, our goal was to assess both the thickness of the peripheral retina and the retina near the optic nerve head. To achieve this, we considered the optic nerve head as the apex of the selected field of view (left upper region of panel A in Figure 4). For each mouse, we obtained OCT images of the superior nasal (SN), superior temporal (ST), inferior nasal (IN), and inferior temporal (IT) fields of the optic nerve. We then averaged the thicknesses from these four fields. In each field, we measured and statistically evaluated the retinal thickness at distances of 1.5, 3, and 4.5 papilla diameters (PD) from the optic nerve head.

      This approach allowed us to ensure that the same locations were selected for comparison and provided a comprehensive assessment of retinal thickness across different regions. We have detailed this methodology in the revised manuscript to clarify the imaging process and the consistency of the selected locations.
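
      For illustration only, the averaging described above can be sketched as follows. This is a minimal sketch and not the analysis code used in the study: the function name, the data layout, and the thickness values are hypothetical placeholders, and it assumes that the four quadrant readings are averaged separately at each distance from the optic nerve head.

```python
from statistics import mean

# Quadrants and distances as named in the response; everything else is illustrative.
QUADRANTS = ("SN", "ST", "IN", "IT")
DISTANCES_PD = (1.5, 3.0, 4.5)  # papilla diameters from the optic nerve head


def mean_thickness_by_distance(readings):
    """Average the four quadrant readings at each distance (assumed scheme)."""
    return {d: mean(readings[q][d] for q in QUADRANTS) for d in DISTANCES_PD}


# Hypothetical thickness readings (micrometres) for a single eye.
example = {
    "SN": {1.5: 60.0, 3.0: 58.0, 4.5: 55.0},
    "ST": {1.5: 62.0, 3.0: 60.0, 4.5: 57.0},
    "IN": {1.5: 61.0, 3.0: 59.0, 4.5: 56.0},
    "IT": {1.5: 61.0, 3.0: 59.0, 4.5: 56.0},
}
averaged = mean_thickness_by_distance(example)
print(averaged)
```

      Keeping the distance as the grouping key means that each eye contributes one value per distance, so eyes are compared at matched locations.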

      Thank you for your insightful feedback.

      Reviewer #2:

      Addressing the following concerns is necessary to improve the manuscript.

      (1) Response to recommendation (1):

      The manuscript contains many grammatical errors and should be carefully reviewed for corrections. For example: In the title, "Silicone Wire Embolization-induced Acute Retinal Artery Ischemia and Reperfusion Model in Mouse: Gene Expression Provide Insight into Pathological Processes". It should be "Provides" instead of "Provide". In the Abstract, "The resident microglia within the retina and peripheral leukocytes which access to the retina were pronounced increased on reperfusion periods." It should be "pronouncedly" or "markedly" instead of " pronounced".

      Thank you for your careful reading and pointing out the grammatical errors in the manuscript. We apologize for these mistakes and have since revised and polished the article with the assistance of native English speakers. Ensuring accurate and clear language usage in scientific writing is crucial, and we appreciate your help in improving the quality of our manuscript. Thank you for bringing these errors to our attention.

      (2) Response to recommendation (2):

      Video 2: the video content from "30s-47s" and "50s-67s" is repeatedly shown.

      Thank you for your careful review of the video. In the process of preparing the external carotid artery for silicone wire embolus insertion, we first ligated the distal end with a square knot and then tied a loose knot at the proximal end. In the video content from "30s-47s" and "50s-67s", we are tying a square knot. We apologize for any confusion caused by these repeated video clips.

      (3) Response to recommendation (3):

      Figure 1: The ConA staining (H-I) and FFA (J-K) were performed before the removal of silicone wire embolus. It would be beneficial to clarify this in the figure legend too. Additionally, the label 'Post. Sup. Alveolar art.: Posterior superior alveolar artery' is not present in Figure 1L.

      Thank you for your thorough review of the manuscript and the valuable suggestions regarding Figure 1. We have updated the figure legend of Figure 1 to clarify that ConA staining (H-I) and FFA (J-K) were performed before the removal of the silicone wire embolus (line 868 and line 873). Additionally, we have included the label 'Post. Sup. Alveolar art' in Figure 1L as you pointed out. We appreciate your careful attention to detail, and we have ensured that these omissions have been rectified in the revised version of the manuscript.

      (4) Response to recommendation (4):

      Figure 2: only representative images of RGCs at the peripheral retina were shown. It is not clear if only RGCs in the peripheral retina were quantified. Is there RGC loss in the central and middle retina in the UPOAO model as well? How many fields of RGCs were quantified for each retina?

      Thank you for your meticulous review of the manuscript. The quantification method of RGCs is described in detail as follows:

      Four radial incisions were made in the retina, which was then flattened on a glass slide to create a "four-leaf clover" shape. The retina was photographed using a fluorescence microscope (BX63, Olympus, Japan). We captured images from three different regions of each retinal quadrant: 0.1 mm-0.5 mm (central region, field numbers: 1, 4, 7, 10), 0.9 mm-1.3 mm (middle region, field numbers: 2, 5, 8, 11), and 1.7 mm-2.1 mm (peripheral region, field numbers: 3, 6, 9, 12) from the optic nerve head, respectively, as shown in Author response image 1.

      Of these, the peripheral field changes were the most noticeable, so we used the Leica SP8 confocal microscope (20X) to capture peripheral field RGCs as a demonstration (Figure 2A, C, E, G). RGCs were counted in twelve fields of each retina, and the average density of RGCs in the twelve fields per retina is shown in Figure 2B, D, F, K. RGC counts in the central (field numbers: 1, 4, 7, 10), middle (field numbers: 2, 5, 8, 11), and peripheral (field numbers: 3, 6, 9, 12) visual fields are shown in Author response tables 1-4. We have included this detailed methodology in the revised manuscript to clarify the quantification process and to address the presence of RGC loss in both the central and middle retina in the UPOAO model. Thank you for pointing out the need for this clarification.
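
      As a hedged illustration of the field layout described above (not the code used for the study), the grouping of the twelve fields into central, middle, and peripheral regions can be sketched as follows. Only the field numbers come from this response; the function name and the counts are hypothetical placeholders, and the sketch works with raw counts per field because the field area needed to convert counts to density is not stated here.

```python
from statistics import mean

# Field numbers per region, as listed in the response.
REGION_FIELDS = {
    "central": (1, 4, 7, 10),
    "middle": (2, 5, 8, 11),
    "peripheral": (3, 6, 9, 12),
}


def summarize_retina(counts_by_field):
    """Return the mean RGC count per region and across all twelve fields."""
    if set(counts_by_field) != set(range(1, 13)):
        raise ValueError("expected counts for fields 1 through 12")
    summary = {
        region: mean(counts_by_field[f] for f in fields)
        for region, fields in REGION_FIELDS.items()
    }
    summary["all_fields"] = mean(counts_by_field.values())
    return summary


# Hypothetical counts for one retina (placeholders, not measured data).
example_counts = {
    1: 400, 2: 360, 3: 300,
    4: 410, 5: 350, 6: 310,
    7: 390, 8: 370, 9: 290,
    10: 400, 11: 360, 12: 300,
}
summary = summarize_retina(example_counts)
print(summary)
```

      Each retina then yields one value per region plus one overall value, which matches the one-data-point-per-retina presentation in Figure 2.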

      Author response image 1.

      Schematic diagram of field selection. Scale bar=1.4 mm. Each retinal petal has three distinct visual fields (the area circled by the green line) that radiate from the optic nerve head to the periphery, in that order, the central, middle, and peripheral visual fields.

      Author response table 1.

      RGCs counts in each field of each retina (30-minute ischemia and 3-day reperfusion)

      Author response table 2.

      RGCs counts in each field of each retina (30-minute ischemia and 7-day reperfusion)

      Author response table 3.

      RGCs counts in each field of each retina (60-minute ischemia and 3-day reperfusion)

      Author response table 4.

      RGCs counts in each field of each retina (60-minute ischemia and 7-day reperfusion)

      (5) Response to recommendation (5):

      Figure 3: The representative wave lines in panels A (60min_3d, 60min_7d) and F do not reflect the statistical analysis presented in panels D, E, and G, especially for the amplitudes of b waves and OPs.

      Thank you for your careful review of the manuscript. We've added labels for a-waves and b-waves, and improved the presentation of OPs to make the details of the amplitude more visible (Figure 3). In the previous version, due to incorrect settings, we did not adjust the ordinate spacing when fitting curves of representative wave lines in the four groups, resulting in the curves being compressed vertically to the same height. We have now adjusted the curves to be fitted under the same scale bar (shown in the bottom right corner of Figure 3A). In addition, we removed the baseline wave of the OPs and adjusted the abscissa scale to highlight the N waves and P waves for easy reading (Figure 3F).

      (6) Response to recommendation (6):

      There are two different Supplementary Figure 1 and no Supplementary Figure 3, resulting in misaligned references to Supplementary Figures 1, 2, and 3 in the text.

      Thank you for your careful review of the manuscript. We have reviewed the manuscript again and identified errors in uploading the supplementary figures, which resulted in duplicate Supplementary Figure 1 and the absence of Supplementary Figure 3. We have corrected these issues and realigned the references to Supplementary Figures 1, 2, and 3 in the text to ensure consistency. We appreciate your attention to detail and your reminder to address this issue.

      (7) Response to recommendation (7):

      There is confusion about the definition of ORL (outer retina layer). In Lines 208-209, ORL was defined as the combined thickness of the rest to the retinal pigment epithelium (RPE). It seems the ONL is included in ORL. But in lines 358-359, 907-908, "the ORL encompassed the region from the inner segment/outer segment (IS/OS) to the RPE". Please make the definition consistent. In addition, it is hard to distinguish the regions marked by the green lines in Fig. 4A (sham image) after Line 902.

      Thank you for your careful review of the manuscript. We have addressed the confusion regarding the definition of the outer retinal layer (ORL). The Heidelberg OCT device does not distinguish the layers of the mouse retina well, so we divided it into three broader layers:

      (1) Ganglion Cell Complex (GCC) layer, which encompasses RNFL+GCL+IPL.

      (2) Middle Retinal Layer, which includes INL+OPL.

      (3) Outer Retinal Layer (ORL), which includes ONL+IS/OS+RPE.

      We apologize for the inconsistency and have revised the errors in the manuscript and figure legends accordingly. Additionally, we have removed rare domain-specific acronyms and replaced them with more commonly understood abbreviations, as suggested, to avoid confusion.

      Furthermore, we have enlarged parts of the OCT images to better display the layers, hoping to meet the readers' requirements and improve clarity. Thank you for your valuable feedback.

      (8) Response to recommendation (8):

      Figure 4 (Panels H-J, L-M) incorporated with the text (Line 902) differs from the high-resolution version of Figure 4 included later in the manuscript. In Figure 4 (Panels H-J, L-M) merged with the text (Line 902), the quantification of the IPL and INL thickness is incorrect, and the scale bar is inaccurate. However in the high-resolution version of Figure 4 provided later, the thickness of the RNFL+GCL is incorrect.

      Thank you for your careful review of the manuscript. The quantification of the IPL and INL thickness in Figure 4 (Panels H-J, L-M) incorporated with the text has been revised to ensure accurate measurements and scale bars (Figure 4 and line 924). The high-resolution version of Figure 4 provided later has been updated to correct the thickness measurements of the RNFL+GCL. We have ensured that the ordinate in the high-resolution version of Figure 4 now correctly represents length units, consistent with the equal proportional conversion used in the integrated text figures.

      Thank you for your valuable feedback and for pointing out these errors. We have made the necessary corrections to align the figures accurately with the manuscript.

      (9) Response to recommendation (9):

      Line 384-386: the statement "Notably, a-waves in ERG and the thickness of the outer retinal layers in both OCT and HE remained unchanged." is not accurate, since a-waves in ERG is not changed in 3 days but changed in 7 days, and the thickness of the outer retinal layers in HE is either not measured or not shown in Figure 4.

      Thank you for your careful review of the manuscript. We apologize for this error and have revised it.

      We aimed to convey that the amplitude of the a-waves, which represent the function of the photoreceptors, does not show significant variation, which is consistent with the thickness of the outer retinal layer observed in OCT and HE images. Our results indicated that at 7 days post-injury, the amplitude of the a-waves in ERG was statistically different only at stimulus light intensities of 0.3, 3.0 and 10.0 cd.s/m2. In contrast, the b-wave amplitude was reduced by half compared to sham eyes at almost all stimulus light intensities. At the same time, the immunofluorescence staining results of photoreceptor cells showed no significant change at 7 days. Therefore, we consider that the change in a-wave amplitudes was minor compared to the significant decrease in b-wave amplitude. We have clarified this in the revised manuscript.

      We also analyzed the thickness of the outer retinal layers in HE and found it to be consistent with OCT results, showing no significant changes (shown in Author response image 2 below).

      Thank you for your valuable feedback, which has helped improve the accuracy and clarity of our manuscript.

      Author response image 2.

      Thickness of OPL, ONL, IS/OS+RPE in HE staining. n=3; ns: no significance (p>0.05).

      (10) Response to recommendation (10):

      Figure 5 and Figure S3: Quantification data from different sections of the same retina should be averaged to represent one single sample (one data point) for statistical analysis. * in images of Fig. 5E, F, I, J is not defined in the figure legend. It would be easier for readers to follow if the GCL, IPL, INL, and OPL were labeled in retinal sections.

      Thank you for your careful review of the manuscript and recommendation. We have repeated the statistical analysis and updated the results in Figure 5 and Figure S3. In the UPOAO experimental eyes, no significant change in the number of HCs (Calbindin) was observed during the 3-day reperfusion period, while a notable reduction was observed after 7 days (Figure 5). Additionally, we have added the definition of the asterisks (*) in the figure legend to clarify their significance. We have also labeled the retinal layers, including the GCL, IPL, INL, OPL, and ONL, in the images to make it easier for readers to follow and understand the data.

      Thank you for helping us improve the clarity and accuracy of our manuscript.
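
      The recommendation to treat each retina as a single data point can be illustrated with a short sketch. This is an assumption-laden example rather than the analysis actually performed: the function name, the retina identifiers, and the counts are hypothetical, and it simply averages all section-level counts that belong to the same retina before any statistical test is applied.

```python
from collections import defaultdict
from statistics import mean


def per_retina_means(section_counts):
    """Collapse (retina_id, count) pairs into one mean value per retina."""
    grouped = defaultdict(list)
    for retina_id, count in section_counts:
        grouped[retina_id].append(count)
    return {retina_id: mean(values) for retina_id, values in grouped.items()}


# Hypothetical section-level cell counts (placeholders, not measured data).
example_sections = [
    ("sham_1", 10), ("sham_1", 12), ("sham_1", 14),
    ("upoao_1", 6), ("upoao_1", 8),
]
data_points = per_retina_means(example_sections)
print(data_points)
```

      Averaging first keeps the sample size equal to the number of retinas, so sections from the same retina are not counted as independent replicates.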

      (11) Response to recommendation (11):

      Lines 407-409, the statement "which aligns with the a-waves observed in ERG (Figure 3D, E) and the changes seen in the outer retinal layers in OCT (Fig S2C, D)" is confusing. No changes were observed by OCT in Fig S2D.

      Thank you for your review, and we apologize for the confusion. The overall trend of the a-wave amplitude in ERG at 7 days did not change significantly, which is consistent with the immunofluorescence staining results of the photoreceptor cells. Based on these observations, we considered that the change in the amplitude of the a-wave was not significant. However, as you pointed out in recommendation 9, a-waves in ERG did change at 7 days at stimulus light intensities of 0.3, 3.0 and 10.0 cd.s/m2, so our description of the a-waves at 7 days was not accurate. We have clarified this point in the revised manuscript to ensure it accurately reflects the data presented.

      (12) Response to recommendation (12):

      In Figure S4, panel C shows lymphocyte-mediated immunity, and panel D shows leukocyte-mediated immunity. Please adjust the figure legend accordingly to reflect the figures.

      Thank you for your careful review of the manuscript. We have modified the figure legend of Figure S4.

      (13) Response to recommendation (13):

      Lines 440-442 state "These results suggested early ischemic processions such as cell migration and potential collateral vessel formation." It is not clear why and how "potential collateral vessel formation" is suggested by Figure 6 and Figure S4. Please clarify this in the text.

      Thank you for your careful review of the manuscript. We have deleted this sentence due to insufficient evidence and replaced it with the following: "These results suggested that in the early stage of retinal ischemic injury, leukocytes from the microvasculature may infiltrate retinal tissue. More experimental validation will be performed to confirm this hypothesis." (line 448). We will be more cautious in drawing conclusions in the future. Thank you for your reminder.

      (14) Response to recommendation (14):

      For the figure legend of Figure 6 "In each heatmap, upper box showed the top 10 up-regulated genes, and the below one showed the top 10 down-regulated genes." Is this correct? It appears that the upper box shows the top 10 down-regulated genes, and the lower box shows the top 10 up-regulated genes.

      Thank you for your careful review of the manuscript. We have modified the figure legend of Figure 6. In the heatmaps, the upper box shows the top 10 down-regulated genes, and the lower box shows the top 10 up-regulated genes (line 977).

      (15) Response to recommendation (15):

      For the figure legend of Figure 7, the statement 'Data points are from retinal sections of four animals' is incorrect, as these data were obtained from whole retinas instead of retinal sections. Please revise the legend to reflect this accurately. The scale bar was absent in the images of Figure 7. Asterisk in Figure 7H and 7I was not defined.

      Thank you for your careful review of the manuscript. We have corrected these errors and added the scale bar (Figure 7D). The white asterisks in Figure 7H and 7I indicate the activated microglial cells, and we have added this definition to the legend of Figure 7 (line 981).

      (16) Response to recommendation (16):

      It would be better to switch the order of Figure S7 and Figure S8 to align with their descriptions in the text.

      Thank you for your recommendation and we have switched the order of Figure S7 and Figure S8.

      (17) Response to recommendation (17):

      The gene names in Figure S8 should be written consistently with those listed in Table S1.

      Thank you for your recommendation and we have corrected the gene names.

      (18) Response to recommendation (18):

      In Figure 9, it is not clear why amacrine cells were not included in the UPOAO model, as amacrine cells were also injured as shown in Figure 5I-L.

      Thank you for your careful review of the manuscript and we have added amacrine cells in Figure 9.

      References

      (1) Yang, H., et al., The connective tissue phenotype of glaucomatous cupping in the monkey eye - Clinical and research implications. Prog Retin Eye Res, 2017. 59: p. 1-52.

      (2) Pavlatos, E., et al., Regional Deformation of the Optic Nerve Head and Peripapillary Sclera During IOP Elevation. Invest Ophthalmol Vis Sci, 2018. 59(8): p. 3779-3788.

      (3) Lee, D., et al., A mouse model of retinal hypoperfusion injury induced by unilateral common carotid artery occlusion. Experimental Eye Research, 2020. 201: p. 108275.

      (4) Barthels, D. and H. Das, Current advances in ischemic stroke research and therapies. Biochim Biophys Acta Mol Basis Dis, 2020. 1866(4): p. 165260.

      (5) Smith, H.K., et al., Critical differences between two classical surgical approaches for middle cerebral artery occlusion-induced stroke in mice. J Neurosci Methods, 2015. 249: p. 99-105.

      (6) Janáky, M., et al., Hypobaric hypoxia reduces the amplitude of oscillatory potentials in the human ERG. Doc Ophthalmol, 2007. 114(1): p. 45-51.

      (7) Tinjust, D., H. Kergoat, and J.V. Lovasik, Neuroretinal function during mild systemic hypoxia. Aviat Space Environ Med, 2002. 73(12): p. 1189-94.

      (8) Lee, D., et al., Retinal Degeneration in a Murine Model of Retinal Ischemia by Unilateral Common Carotid Artery Occlusion. Biomed Res Int, 2021. 2021: p. 7727648.

      (9) Yamamoto, H., et al., Complex neurodegeneration in retina following moderate ischemia induced by bilateral common carotid artery occlusion in Wistar rats. Exp Eye Res, 2006. 82(5): p. 767-79.

      (10) Palmhof, M., et al., From Ganglion Cell to Photoreceptor Layer: Timeline of Deterioration in a Rat Ischemia/Reperfusion Model. Front Cell Neurosci, 2019. 13: p. 174.

      (11) Adachi, M., et al., High intraocular pressure-induced ischemia and reperfusion injury in the optic nerve and retina in rats. Graefes Arch Clin Exp Ophthalmol, 1996. 234(7): p. 445-51.

      (12) Jehle, T., et al., Quantification of ischemic damage in the rat retina: a comparative study using evoked potentials, electroretinography, and histology. Invest Ophthalmol Vis Sci, 2008. 49(3): p. 1056-64.

      (13) Hayreh, S.S., H.E. Kolder, and T.A. Weingeist, Central retinal artery occlusion and retinal tolerance time. Ophthalmology, 1980. 87(1): p. 75-8.

      (14) Luo, X., et al., Hypoglycemia induces general neuronal death, whereas hypoxia and glutamate transport blockade lead to selective retinal ganglion cell death in vitro. Invest Ophthalmol Vis Sci, 2001. 42(11): p. 2695-705.

      (15) Schmid, H., et al., Loss of inner retinal neurons after retinal ischemia in rats. Invest Ophthalmol Vis Sci, 2014. 55(4): p. 2777-87.

      (16) Furashova, O. and E. Matthè, Hyperreflectivity of Inner Retinal Layers as a Quantitative Parameter of Ischemic Damage in Acute Retinal Vein Occlusion (RVO): An Optical Coherence Tomography Study. Clin Ophthalmol, 2020. 14: p. 2453-2462.

      (17) Pang, Y., et al., CD38 Deficiency Protects Mouse Retinal Ganglion Cells Through Activating the NAD+/Sirt1 Pathway in Ischemia-Reperfusion and Optic Nerve Crush Models. Invest Ophthalmol Vis Sci, 2024. 65(5): p. 36.

      (18) Feng, Y., et al., GSK840 Alleviates Retinal Neuronal Injury by Inhibiting RIPK3/MLKL-Mediated RGC Necroptosis After Ischemia/Reperfusion. Invest Ophthalmol Vis Sci, 2023. 64(14): p. 42.

      (19) Zeng, S., et al., CREG Protects Retinal Ganglion Cells loss and Retinal Function Impairment Against ischemia-reperfusion Injury in mice via Akt Signaling Pathway. Mol Neurobiol, 2023. 60(10): p. 6018-6028.

      (20) Rosenbaum, D.M., et al., The role of the p53 protein in the selective vulnerability of the inner retina to transient ischemia. Invest Ophthalmol Vis Sci, 1998. 39(11): p. 2132-9.

      (21) Zhang, Y., et al., Melatonin Alleviates Pyroptosis of Retinal Neurons Following Acute Intraocular Hypertension. CNS Neurol Disord Drug Targets, 2021. 20(3): p. 285-297.

      (22) Zhu, J., et al., Protective effects of Erigeron breviscapus Hand.- Mazz. (EBHM) extract in retinal neurodegeneration models. Mol Vis, 2018. 24: p. 315-325.

      (23) Wachtmeister, L., Oscillatory potentials in the retina: what do they reveal. Prog Retin Eye Res, 1998. 17(4): p. 485-521.

      (24) Cao, W., et al., Dextromethorphan attenuates the effects of ischemia on rabbit electroretinographic oscillatory potentials. Documenta Ophthalmologica, 1993. 84(3): p. 247-256.

      (25) Xu, J., et al., Pregabalin Mediates Retinal Ganglion Cell Survival From Retinal Ischemia/Reperfusion Injury Via the Akt/GSK3β/β-Catenin Signaling Pathway. Invest Ophthalmol Vis Sci, 2022. 63(12): p. 7.

      (26) Takács, B., et al., Electroretinographical Analysis of the Effect of BGP-15 in Eyedrops for Compensating Global Ischemia-Reperfusion in the Eyes of Sprague Dawley Rats. Biomedicines, 2024. 12(3).

      (27) Porciatti, V., Electrophysiological assessment of retinal ganglion cell function. Exp Eye Res, 2015. 141: p. 164-70.

      (28) Ridder, W.H. and S. Nusinowitz, The visual evoked potential in the mouse—Origins and response characteristics. Vision Research, 2006. 46(6): p. 902-913.

      (29) Liu, S., et al., An optimized procedure to record visual evoked potential in mice. Exp Eye Res, 2022. 218: p. 109011.

    1. eLife assessment

      The work investigates mechanisms necessary and sufficient for initiating tissue bending in the Cellular Potts Model. The authors emphasize how differences in implicit model assumptions, such as different constraints on cell shape change and cell rearrangement, may explain different outcomes in Cellular Potts Model and Vertex Model simulations. Despite incomplete evidence supporting the major claims due to a rather coarse-grained exploration of the model, the findings are valuable for the biophysics and computational biology communities and caution toward greater care in the interpretation of model results.

    2. Reviewer #1 (Public Review):

      Summary:

      Satoshi Yamashita et al., investigate the physical mechanisms driving tissue bending using the cellular Potts Model, starting from a planar cellular monolayer. They argue that apical length-independent tension control alone cannot explain bending phenomena in the cellular Potts Model, contrasting with previous works, particularly Vertex Models. They conclude that an apical elastic term, with zero rest value (due to endocytosis/exocytosis), is necessary to achieve apical constriction, and that tissue bending can be enhanced by adding a supracellular myosin cable. Additionally, a very high apical elastic constant promotes planar tissue configurations, opposing bending.

      Strengths:

      - The finding of the required mechanisms for tissue bending in the cellular Potts Model provides a natural alternative for studying bending processes in situations with highly curved cells.

      - Despite viewing cellular delamination as an undesired outcome in this particular manuscript, the model's capability to naturally allow T1 events might prove useful for studying cell mechanics during out-of-plane extrusion.

      Weaknesses:

      - The authors claim that the cellular Potts Model (CPM) is unable to achieve the results of the vertex model (VM) simulations due to naturally non-straight cellular junctions in the CPM versus the VM. The lack of a substantial comparison undermines this assertion. None of the references mentioned in the manuscript are from a work using a vertex model with straight cellular junctions, simulating apical constriction purely by enhancing a length-independent apical tension. Sherrard et al. and Pérez-González et al. use 2D and 3D Vertex Models, respectively, with a "contractility" force driving apical constriction. However, their models allow cell curvature. Both references suggest that the cell side flexibility of the CPM shouldn't be the main issue of the "contractility model" for apical constriction.

      - The myosin cable is assumed to encircle the invaginated cells. Therefore, it is not clear why the force acts over the entire system (even when decreasing towards the center), and not locally in the contour of the group of cells under constriction. The specific form of the associated potential is missing. It is unclear how dependent the results of the manuscript are on these not-well-motivated and model-specific rules for the myosin cable.

      - The authors are using different names than the conventional ones for the energy terms. Their current attempt to clarify what is usually done in other works might lead to further confusion.

    3. Reviewer #2 (Public Review):

      Summary:

      In their work, the Authors study local mechanics in an invaginating epithelial tissue. The work, which is mostly computational, relies on the Cellular Potts model. The main result shows that an increased apical "contractility" is not sufficient to properly drive apical constriction and subsequent tissue invagination. The Authors propose an alternative model, where they consider an alternative driver, namely the "apical surface elasticity".

      Strengths:

      It is surprising that despite the fact that apical constriction and tissue invagination are probably the most studied processes in tissue morphogenesis, the underlying physical mechanisms are still not entirely understood. This work supports this notion by showing that simply increasing apical tension is perhaps not sufficient to locally constrict and invaginate a tissue.

      Weaknesses:

      Although the Authors have improved and clarified certain aspects of their results as suggested by the Reviewers, the presentation still mostly relies on showing simulation snapshots. Snapshots can be useful, but when there are too many, the results are hard to read. The manuscript would benefit from more quantitative plots like phase diagrams etc.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Satoshi Yamashita et al., investigate the physical mechanisms driving tissue bending using the cellular Potts Model, starting from a planar cellular monolayer. They argue that apical length-independent tension control alone cannot explain bending phenomena in the cellular Potts Model, contrasting with the vertex model. However, the evidence supporting this claim is incomplete. They conclude that an apical elastic term, with zero rest value (due to endocytosis/exocytosis), is necessary in constricting cells and that tissue bending can be enhanced by adding a supracellular myosin cable. Notably, a very high apical elastic constant promotes planar tissue configurations, opposing bending.

      Strengths:

      - The finding of the required mechanisms for tissue bending in the cellular Potts Model provides a more natural alternative for studying bending processes in situations with highly curved cells.

      - Despite viewing cellular delamination as an undesired outcome in this particular manuscript, the model's capability to naturally allow T1 events might prove useful for studying cell mechanics during out-of-plane extrusion.

      We thank the reviewer for the careful comments and insightful suggestions.

      Weaknesses:

      - The authors claim that the cellular Potts Model is unable to obtain the vertex model simulation results, but the lack of a substantial comparison undermines this assertion. No references are provided with vertex model simulations, employing similar setups and rules, and explaining tissue bending solely through an increase in a length-independent apical tension.

      Studies cited in a previous paragraph included the simulations employing the increased length-independent apical tension. For the sake of clarity, we added the citation to them as below.

      P4L174: “In contrast to the simulations in the preceding studies (Sherrard et al., 2010; Conte et al., 2012; Perez-Mockus et al., 2017; Pérez-González et al., 2021), our simulations could not reproduce the apical constriction”.

      We did not copy the parameters of the vertex models in the preceding studies because we also found that the apical, lateral, and basal surface tensions must be balanced otherwise the epithelial cell could not maintain the integrity (Figure 1—figure supplement 1), while the ratio was outside of the suitable range in the preceding studies.

      - The apparent disparity between the two models is attributed to straight versus curved cellular junctions, with cells with a curved lateral junction achieving lower minimum energies at steady-state. However, a critical discussion on the impact of T1 events, allowing cellular delamination, is absent. Note that some of the cited vertex model works do not allow T1 events while allowing curvature.

      We appreciate the comment and added it to the discussion as suggested.

      P12L301: “Even when the vertex model allowed the curved lateral surface, the model did not assume the cells to be rearranged and change neighbors, limiting the cell delamination (Pérez-González et al., 2021).”

      P12L311: “Note that the vertex model could also be extended to incorporate the curved edges and rearrangement of the cells by specifically programming them, and would reproduce the cell delamination. That is, we could find the importance of the balanced pressure because the cellular Potts model intrinsically included a high degree of freedom for the cell shape, the cell rearrangement, and the fluctuation.”

      - The suggested mechanism for inducing tissue bending in the cellular Potts Model, involving an apical elastic term, has been utilized in earlier studies, including a cited vertex model paper (Polyakov 2014). Consequently, the physical concept behind this implementation is not novel and warrants discussion.

      The reviewer is correct but Polyakov et al. assumed “that the cytoskeletal components lining the inside membrane surfaces of the cells provide these surfaces with springlike elastic properties” without justification. We assumed that the myosin activity generated not the elasticity but the contractility based on Labouesse et al. (2015), and expected that the surface elasticity corresponded with the membrane elasticity. Also, in the physical concept, we clarified how the contractility and the elasticity differently deformed the cells and tissue, and demonstrated why the elasticity was important for the apical constriction. We added it to the discussion as below.

      P12L316: “In the preceding studies, the apically localized myosin was assumed to generate either the contractile force (Sherrard et al., 2010; Conte et al., 2012; Perez-Mockus et al., 2017; Pérez-González et al., 2021) or the elastic force (Polyakov et al., 2014; Inoue et al., 2016; Nematbakhsh et al., 2020). However, the limited cell shape in the vertex model made them similar in terms of the energy change during the apical constriction, i.e., the effective force to decrease the apical surface. In this study, we showed that the contractile force and the elastic force differently deformed the cells and tissue, and demonstrated why and how the elasticity was important for the apical constriction.”

      - The absence of information on parameter values, initial condition creation, and boundary conditions in the manuscript hinders reproducibility. Additionally, the explanation for the chosen values and their unit conversion is lacking.

      We agree with the comment.

      For the initial configuration, we added an explanation to Tissue deformation by increased apical contractility with cellular Potts model section in the Results as below.

      P4L170: “A simulation started from a flat monolayer of cells beneath the apical ECM, and was continued until resulting deformation of cells and tissue could be evaluated for success or failure of reproducing the apical constriction.”

      For the parameter values we added a section “Parameters for the simulations” in the Methods.

      For the parameters unit conversion, we did not measure the surface tension and cell pressure in an actual tissue and thus could not compare the parameters to the actual forces. Instead, we varied the parameters and demonstrated that the apical constriction was reproduced with the wide range of the parameter values. We added it to the discussion as below.

      P12L310: “It succeeded with a wide range of parameter values, indicating a robustness of the model.”

      Reviewer #2 (Public Review):

      Summary:

      In their work, the authors study local mechanics in an invaginating epithelial tissue. The mostly computational work relies on the Cellular Potts model. The main result shows that an increased apical "contractility" is not sufficient to properly drive apical constriction and subsequent tissue invagination. The authors propose an alternative model, where they consider an alternative driver, namely the "apical surface elasticity".

      Strengths:

      It is surprising that despite the fact that apical constriction and tissue invagination are probably the most studied processes in tissue morphogenesis, the underlying physical mechanisms are still not entirely understood. This work supports this notion by showing that simply increasing apical tension is perhaps not sufficient to locally constrict and invaginate a tissue.

      We thank the reviewer for recognizing the importance and novelty of our work.

      Weaknesses:

      The findings and claims in the manuscript are only partially supported. With the computational methodology for studying tissue mechanics being so well developed in the field, the authors could probably have done a more thorough job of supporting the main findings of their work.

      We thank the reviewer for the careful assessment and suggestions. However, our simulations were computationally expensive, and modeling the epithelium in an analytically calculable expression would require substantial work that is beyond the scope of the present study.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) Reference line 648: Correct the author's name (Pérez-González).

      We thank the reviewer and corrected the reference.

      (2) "Pale" colors are challenging to discern.

      We updated the figures.

      (3) Figure 1j: What does the yellow color in the cellular junction represent?

      We used the apical lateral site colored yellow in Fig. 1e-f’ to simulate the effect of the adherens junction. We updated the figure legend.

      (4) Figure 2c - left: Why is there a red apical junction?

      Our simulation model marked the apical junction in the initial configuration and updated the marking based on connectedness to other surrounding sites marked as apical in the same cell. However, once a cell was delaminated and lost its apical junction, any surface sites not adjacent to other epithelial cells were marked as basal because they were not adjacent to the apical junction.

      We added it to Cellular Potts model with partial surface elasticity section in the Methods as below.

      P17L430: “To simulate the differential physical properties of the apical, lateral, and basal surfaces, the subcellular locations are marked automatically, and the marking is updated during the simulation. In each cell, sites adjacent to different cells but not to the medium are marked as lateral.

      At the initial configuration, sites adjacent to the apical ECM are marked as apical, and during the simulation, sites adjacent to medium and other apical sites in the same cell are marked as apical.

      Rest of sites which are adjacent to medium but not marked as apical are marked as basal.

      Therefore, once a cell is delaminated and loses its apical surface, afterwards all sites in the cell adjacent to the medium are marked as basal even if they are adjacent to the apical ECM or the outer body fluid.”
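
      The marking rules quoted above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the authors' implementation: the lattice encoding (positive integers for cells, `0` for medium, `-1` for apical ECM), the 8-site neighborhood, and the names `mark_surface`, `APICAL`, `LATERAL`, and `BASAL` are all assumptions introduced here.

```python
import numpy as np

MEDIUM, ECM = 0, -1                        # assumed labels for non-cell sites
NONE, APICAL, LATERAL, BASAL = 0, 1, 2, 3  # assumed labels for surface marks

def neighbors(grid, y, x):
    """Yield the in-bounds 8-neighborhood of site (y, x)."""
    h, w = grid.shape
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if (dy or dx) and 0 <= y + dy < h and 0 <= x + dx < w:
                yield y + dy, x + dx

def mark_surface(grid, prev=None):
    """Mark cell sites as apical, lateral, or basal.

    prev=None applies the initial rule (sites touching the apical ECM are
    apical). Otherwise a site touching medium/ECM stays or becomes apical
    only if a neighboring site of the same cell was apical in `prev`.
    """
    marks = np.zeros(grid.shape, dtype=int)
    for y, x in zip(*np.nonzero(grid > 0)):
        cell = grid[y, x]
        nbs = list(neighbors(grid, y, x))
        touches_medium = any(grid[n] <= 0 for n in nbs)
        touches_other = any(grid[n] > 0 and grid[n] != cell for n in nbs)
        if touches_medium:
            if prev is None:
                apical = any(grid[n] == ECM for n in nbs)
            else:
                apical = any(grid[n] == cell and prev[n] == APICAL for n in nbs)
            marks[y, x] = APICAL if apical else BASAL
        elif touches_other:
            marks[y, x] = LATERAL
    return marks

# Two cells, three sites tall, with ECM above and medium below.
grid = np.array([[-1, -1, -1, -1],
                 [ 1,  1,  2,  2],
                 [ 1,  1,  2,  2],
                 [ 1,  1,  2,  2],
                 [ 0,  0,  0,  0]])
initial = mark_surface(grid)
updated = mark_surface(grid, initial)
```

      Under these assumptions, a cell whose previous marking contains no apical site can never regain one, even while it touches the ECM, which corresponds to the behavior the authors describe for delaminated cells.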

      (5) Figure 4a: The snapshots are not in a steady state but in the middle of deformation. Is the time the same for all snapshots? The motivation to change P_0a is related to endocytosis. However, this could be achieved by decreasing P_0a to a non-zero value. Here, in the more drastic limit, the depth (a measure of bending) is very slight, approximately half of a cell size. What physically limits further invagination? Is it the number of cells or the range of parameters under study?

      The time length was the same for simulations in each figure, and we added it to the Parameters for the simulations section in the Methods as below.

      P18L466: “In each figure, snapshots of the simulations show deformation by the same time length unless specified.”

      For P_0a, the reviewer is correct and the iterated ratcheting may decrease P_0a step by step instead of making it 0 immediately. Still, with P_0a > 0, the energy function and its derivative are both increasing with respect to the apical width as long as P_a > P_0a, and thus the apical shrinkage would be synchronized, even though the deformation would be smaller. We also ran simulations by decreasing P_0a to 0.6 times the initial P_a, and observed smaller deformation as expected. On the other hand, the non-zero P_0a made the invagination deeper when it was combined with the effect of the surrounding supracellular myosin cable, maybe due to a resistance of the apical surface against compression. One of the novel and important findings in this study is the synergetic effect of the elasticity-based apical constriction and the surrounding supracellular myosin cable. To demonstrate that the deep invagination was not due to the apical surface resistance against the compression, we showed the simulations with P_0a = 0.
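
      The synchronization argument can be made concrete with a small numerical sketch. It assumes, for illustration only, a linear contractility term `J * width` and a quadratic elasticity term `E * (width - rest)**2`; the function names and parameter values are hypothetical and are not taken from the manuscript.

```python
def contractile_energy(width, J=1.0):
    """Length-independent tension: energy grows linearly with apical width."""
    return J * width

def elastic_energy(width, E=1.0, rest=0.0):
    """Apical elasticity: energy grows quadratically above the rest width."""
    return E * (width - rest) ** 2

def shrink_force(energy, width, h=1e-6, **params):
    """Central-difference dH/d(width), i.e. the drive to shrink the surface."""
    return (energy(width + h, **params) - energy(width - h, **params)) / (2 * h)

for w in (2.0, 4.0):
    print(w,
          shrink_force(contractile_energy, w),
          shrink_force(elastic_energy, w),
          shrink_force(elastic_energy, w, rest=1.2))
```

      In this sketch the contractile drive is the same for narrow and wide surfaces, whereas the elastic drive grows with the width whenever the width exceeds the rest value. A cell that lags behind its neighbors is therefore pulled harder, which is one way to read the authors' statement that the shrinkage is synchronized.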

      For the conditions for further invagination, it may include the number of cells, a ratio between the cell height and width (Figure 5—figure supplement 1), interaction with ECM (Figure 5—figure supplement 2), etc. For the parameter, there might be an upper limit (Figure 4). We did not test the number of cells because of its computational cost. Among the conditions we tested, we found the planar compression by surrounding supracellular myosin the most influential rather than the mechanical property of apically constricting cells themselves.

      How each condition and parameter contributes to the invagination shall be studied in future. We added it to the conclusion as below.

      P15L395: “The depth, curvature, and speed of the invagination might be influenced by the cell shape, configuration, and parameters, and how each condition contributes to the invagination shall be studied in future.”

      (6) Figure 6b: What does the cell-surface color represent? If the idea was to represent junction tension, it would be clearer to color the junctions only.

      The junction tension may vary differently in different situations. For example, T1 transition is accompanied by enriched myosin along a shrinking cell-cell junction, and the junction bears higher tension, but other junctions of the same cell do not and thus the cell does not decrease its apical surface. In chick embryo neural tube closure, the junction tension is also polarized, and the cells shrink the apical surface along medial-lateral axis, driving the apical constriction (Nishimura et al., 2012, doi:10.1016/j.cell.2012.04.021). In the case of Drosophila embryo tracheal invagination, the cells shrank their apical surface isotropically (Figure 6a). If the junction tension was responsible for the shrinkage, all junctions of the cell must bear higher tension. Based on this assumption, the junction tension was averaged in each cell to check if the tracheal cells bore the higher average tension than surrounding cells.

      We also plotted stress tensor and calculated nematic order to check if there was radial or encircling tension alignment in the tracheal pit, but there was not.

      (7) Figure 6c: What does the junction color represent here?

      The junction color represents the relative junctional tension. We updated the figure legend.

      (8) Figure 6d-e: It is challenging to understand which error bar corresponds to each dataset.

      We updated the figure.

      (9) What is the definition of relative pressure?

      The geometrical tension inference method assumes that the tissue is in mechanical equilibrium and a sum of the junctional tensions and cell pressures pulling/pushing a vertex (tricellular junction) is 0. Therefore the calculated tensions and pressures are proportional to each other but not absolute values. We added it to the 3D Bayesian tension inference section of Methods as below.

      P24L567: “Since Equation 13 and Equation 14 only evaluate the balance among the forces, they cannot estimate an absolute value but only a relative value of the tension and pressure.”
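
      Why force balance alone fixes only relative values can be seen in a minimal sketch of a single tricellular vertex. The example below is an assumption-laden illustration, not the authors' Equations 13 and 14: it considers straight junctions only, omits the pressure terms, and uses a hypothetical helper `net_force`.

```python
import numpy as np

def net_force(tensions, angles):
    """Sum of junctional tension vectors pulling on one vertex (2D)."""
    t = np.asarray(tensions, dtype=float)
    a = np.asarray(angles, dtype=float)
    return np.array([np.sum(t * np.cos(a)), np.sum(t * np.sin(a))])

# Three junctions meeting at 120 degrees to each other.
angles = np.deg2rad([90.0, 210.0, 330.0])

balanced = net_force([1.0, 1.0, 1.0], angles)
rescaled = net_force([3.0, 3.0, 3.0], angles)
print(np.allclose(balanced, 0.0, atol=1e-12),
      np.allclose(rescaled, 0.0, atol=1e-12))
```

      Because the balance condition is linear and homogeneous in the tensions, multiplying every tension by the same factor leaves the vertex in equilibrium, so the geometry cannot distinguish the two solutions.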

      (10) In the main text, it is mentioned that a large Es (apical elastic constant) leads to flat surfaces, avoiding bending, but the abstract says "strong apical surface tension," which, according to the rest of the text, would seem to be J_apical. Clarification is needed.

      The surface tension includes both the surface contractility and the surface elasticity.

      We added it to Extended cellular Potts model to simulate epithelial deformations section in the Results as below.

      P3L122: “Note that in some studies the tension and the contractility are considered as equivalent, but they are distinguished in this study.”

      and

      P4L151: “The energy H included only the terms of the contact energy (Equation 1) and the area constraint (Equation 5), but neither the surface elasticity (Equation 2) nor (Equation 3) was included, and thus the surface tension was determined by the contact energy.”

      Reviewer #2 (Recommendations For The Authors):

      (1) The model used is rather specific and it is rather confusing whether the issue is in the methodology or fundamental biophysics of apical constriction. For instance, one of the main narratives of the manuscript is that the Cellular Potts model better predicts apical constriction and tissue invagination than the vertex model. As I understand it, and as the authors state in p7 (line 210), "the difference between the vertex model and the cellular Potts model results was due to the straight lateral surface...". I assume that if apical constriction and tissue invagination were modelled with a vertex model with curved edges, while also allowing for cell rearrangements out of the tissue plane (some sort of epithelium-to-mesenchyme transition), the vertex model would yield exactly the same results as in the authors' cellular Potts model. If my understanding is correct, the authors should change the narrative of their manuscript and focus more on the comparison of a model with flat vs. curved edges, with "contractility" vs. "surface elasticity", with patterned apical contractility vs. non-patterned contractility (see my comment in point 2 below)... and not on comparison between CPM and VM.

      We appreciate the comments. The reviewer is correct that the vertex model can include the curved edges and the cell rearrangement, and it would reproduce the result of our cellular Potts model simulations. For the cellular Potts model, there was no need to specifically design how much the cell surface could be curved in a large arc, zigzag, or other shape, and that enabled us to find the conditions of delamination and bending.

      We added it to the discussion as below.

      P12L311: “Note that the vertex model could also be extended to incorporate the curved edges and rearrangement of the cells by specifically programming them, and would reproduce the cell delamination. That is, we could find the importance of the balanced pressure because the cellular Potts model intrinsically included a high degree of freedom for the cell shape, the cell rearrangement, and the fluctuation.”

      (2) About physics... and I think this is a really important point: one of the observations in the model was that in the "contractility" model, only "edge cells" shrank their apical surface, while inner cells remained quadrilateral. Related to this, the authors say that one of the requirements for proper apical constriction is a mechanism that "simultaneously shrinks the apical surface among cells in a cluster". What would happen if the authors assumed patterned contractility, meaning that cells in the center of the cluster would be most apically-contractile, while those further away from the center, would not be contractile? Features like this were investigated in studies of ventral-furrow invagination [see, for instance, Spahn and Reuter PLOS ONE (2013) and Rauzi et al. Nat Commun (2015)-Fig. S13d].

      We thank the reviewer for the critical comment, and ran simulations with the patterned apical contractility. The apical contractility following a gradient of parabola shape succeeded in the simultaneous apical shrinkage. However, it was weak against fluctuations and the cells were delaminated by chance.

      We added it to Apical constriction by modified apical elasticity section in the result as below.

      P9L252: “We also tested another model for the simultaneous apical shrinkage, a gradient contractility model (Spahn and Reuter, 2013; Rauzi et al., 2015). If the inner cells bear higher apical surface contractility than the edge cells, the inner cells may shrink their apical surface. To synchronize the apical shrinkage, the apical contractility must follow a parabola shape gradient. Even though the gradient contractility enabled the cells to shrink the apical surface simultaneously, often some of the cells shrank faster than neighbors and were delaminated by chance (Figure 4—figure supplement 1).”
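
      One way to see why a parabolic profile might synchronize the shrinkage is the following one-dimensional sketch. It is an interpretation added here, not the authors' derivation: cells are treated as a row of segments, each junction is pulled by the difference between the contractilities of its two neighbors, and the names `parabolic_contractility` and `junction_pull` and all parameter values are hypothetical.

```python
def parabolic_contractility(n_half, center=2.0, edge=1.0):
    """Apical contractility per cell, highest at the middle of the row."""
    a = (center - edge) / n_half ** 2
    return [center - a * i ** 2 for i in range(-n_half, n_half + 1)]

def junction_pull(tension):
    """Net rightward pull on each junction between cell k and cell k + 1."""
    return [tension[k + 1] - tension[k] for k in range(len(tension) - 1)]

profile = parabolic_contractility(3)
pulls = junction_pull(profile)
steps = [pulls[k + 1] - pulls[k] for k in range(len(pulls) - 1)]
print(pulls)
print(steps)
```

      With a parabolic profile the pull on a junction is proportional to its distance from the center and points inward on both sides. In an overdamped picture, junction velocities proportional to position correspond to every cell shrinking at the same relative rate. The sketch contains no fluctuations, so it says nothing about the delamination by chance that the authors report.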

      (3) The quality of the figures should be improved. Especially, Figure 3 and the related explanation in lines 183-192. This explanation is way too complicated and it is not clear what Figure 3c shows. For instance: if the arrows are indeed showing contractile forces (as written in the caption) then they are not illustrated correctly, but should be tangential to the cell membrane.

      We updated the figure.

      (4) The figures mostly show steady-state cross-sections from simulations. I miss a more dedicated study with model parameters being varied through wider ranges and some phase diagrams being shown etc. Also, some results could probably be supported by analytic calculations. For instance, the condition for stability (discussed in p4 lines 145-151), cells' preferred aspect ratio, cells' preferred "wedgeness" i.e., local curvature etc... I am sure some of these, if not all, could be calculated analytically and then these analytic results could help to interpret the phase diagrams.

      For the simulation results shown in the figures, we were not sure if the simulation results were in a steady state or not. We added it to Tissue deformation by increased apical contractility simulated with cellular Potts model section in the Results as below.

      P4L170: “A simulation started from a flat monolayer of cells beneath the apical ECM, and was continued until resulting deformation of cells and tissue could be evaluated for success or failure of reproducing the apical constriction.”

      For the ranges of parameters, we ran the simulations over a wider range and showed results from a sub-range. We added it to Parameters for the simulations section in Methods as below.

      P18L464: “The parameters were varied in a range, and the figures showed simulations with parameter values within a sub-range so that the results showed both success and failure in a development of interest.”

      For the analytical calculations, the Figure 3f shows a kind of phase diagram for shapes of a single cell. To clarify this, we rephrased “map of cell shapes” to “Phase diagram of cell shapes” in the figure legend, and added an explanation to the Results section as below.

      P6L207: “For the analysis of the cell shape in motion, we plotted a phase diagram for shapes of a single cell (Figure 3f).”

      For the analytical evaluation of the cellular Potts model simulations, a previous study performed a similar analysis, but it concerned a cell of isotropic shape in a steady state (Magno et al., 2015, doi:10.1186/s13628-015-0022-x). Also, our simulation framework is computationally expensive and we could not vary the parameters in fine resolution. Therefore we could not include it in this study.

      (5) I am not sure about the terminology "contractility" vs. "elasticity". In Farhadifar et al. (2007) "contractility" is described by a squared apical-perimeter energy term, while in this work, the authors describe it by a surface-energy-like term.

      In general, elasticity is the ability of a material to resist deformation and to return to its original shape/size. In Farhadifar et al. (2007), the cell apical area was assigned the area elasticity in this meaning. Contractility is the ability to decrease the size/length, and thus it could be expressed by either a linear or a quadratic term depending on the model. In this study, we assumed cell-cell/cell-ECM adhesion and myosin activity to generate the surface contractility, and thus employed the linear expression. In Farhadifar et al. (2007) it was described as a line tension.

      We used the terms surface ‘elasticity’ and ‘contractility’ as distinctive elements composing the surface ‘tension’. We added it to Extended cellular Potts model to simulate epithelial deformations section in the Results as below.

      P3L122: “Note that in some studies the tension and the contractility are considered as equivalent, but they are distinguished in this study.”

      (6) It is not entirely clear what are apical, basal, lateral, and cell "perimeters". This is a 2D model, so I assume all P-s are in fact interface lengths. In either case, this needs to be explained more clearly.

      We updated the explanation in Extended cellular Potts model to simulate epithelial deformations section in the Results as below.

      P3L111: “The cell's perimeter was partitioned automatically based on adjacency with other cells, and it was marked as apical, lateral, or basal. Also, apico-lateral sites were marked as a location for the adherens junction. This cell representation also cast the vertical section of the cell. Therefore an area of the cell corresponded with a body of the cell, and a perimeter of the cell corresponded with the cell surface. Likewise the apical, lateral, and basal parts of the perimeter corresponded with the apical surface, cell-cell interface, and the basal surface of the cell respectively.”

      (7) The term H_{mc} is not clear at all. Why is this term called potential energy? What is U(i)? What is the exact biophysical interpretation of this term in 2D vs 3D?

      In 3D, the supracellular myosin cable is formed encircling the cells deformed by the apical constriction. Shrinking of the supracellular myosin cable makes the circle small, and it moves the cable toward the center of the circle. To simulate this motion of the supracellular myosin cable in the 2D cross section, we assigned the force exerted on the adherens junction of the boundary cells pulling toward the center, and because the force is relative to the position of the adherens junction and the center, it was expressed by the potential energy in the simulation.

      We updated Extended cellular Potts model to simulate epithelial deformation section in Results and Cellular Potts model with potential energy section in Methods as below.

      P4L140: “The potential energy was defined by a scalar field which made a horizontal gradient decreasing toward the center,”

      and

      P17L449: “In 3D, tension on a circular actomyosin cable would shrink the circle, and the shrinkage would pull the cable toward the center of the circle. In 2D cross section, the cable is pulled horizontally toward the middle line.”
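
      The description above can be sketched as a scalar field over the horizontal coordinate. The specific form of the potential is not given in the response, so the linear field `strength * abs(x - center)` below is purely an assumption chosen for illustration, as are the names `cable_potential` and `cable_energy`.

```python
def cable_potential(x, center, strength=1.0):
    """Hypothetical scalar field that decreases linearly toward the center."""
    return strength * abs(x - center)

def cable_energy(junction_xs, center, strength=1.0):
    """Potential energy summed over the adherens-junction sites of boundary cells."""
    return sum(cable_potential(x, center, strength) for x in junction_xs)

center = 10.0
before = cable_energy([2.0, 18.0], center)   # boundary junctions far apart
after = cable_energy([3.0, 17.0], center)    # each moved one site inward
delta = after - before
print(before, after, delta)
```

      In a cellular Potts update, a copy attempt that moves a boundary junction toward the middle line lowers this energy (`delta` is negative) and is therefore favored, which mimics the horizontal pull of a shrinking cable in the 2D cross section.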

      (8) Highten->increased

      We updated the text.

      (9) "It seems natural to consider that the myosin generates a force proportional to its density but not to the surface width nor the strain". This sentence should be supported by a reference. Also, if the force is proportional to myosin density, then it must depend on surface width, since density, I assume, is the number of motors per area.

      For the myosin density and generated force, in all preceding studies cited in this manuscript and others to the best of our knowledge, the myosin and actin filament density visualized by staining or labeling had been assumed relevant to the generated contractility without references. Therefore it might be a well-established and shared assumption.

      For the independence from the surface width and strain, the review comment is correct, but the results would be the same. If we presumed that the number of motors on the apical surface was constant in a cell during the apical constriction, then the density would increase when the apical surface was contracted, and thus it would make the apical contractility more unbalanced and promote the delamination. We added it to the results and discussion as below.

      P4L166: “For the sake of simplicity, we ignored an effect of the constriction on the apical myosin density, and discussed it later.”

      P14L328: “In our model, for the sake of simplicity, we ignored an effect of the constriction on the apical myosin density. If we presumed that the apical myosin would be condensed by the shrinkage of the apical surface, it would increase the apical tension in the shrinking cell and is expected to promote the cell delamination further. Therefore it would not change the results.”
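
      The argument about condensation can be illustrated with a short sketch. It assumes, hypothetically, that a cell carries a fixed number of apical motors, that density is motors per unit apical width in the 2D cross section, and that tension scales linearly with density. The names and the `force_per_density` constant are illustrative and are not taken from the model.

```python
def apical_myosin_density(n_motors: float, apical_width: float) -> float:
    """Motors per unit apical width, with the motor count held constant."""
    if apical_width <= 0:
        raise ValueError("apical_width must be positive")
    return n_motors / apical_width


def apical_tension(n_motors: float, apical_width: float,
                   force_per_density: float = 1.0) -> float:
    """Tension assumed proportional to myosin density."""
    return force_per_density * apical_myosin_density(n_motors, apical_width)
```

      With these assumptions, halving the apical width doubles the tension, so a shrinking cell pulls harder as it shrinks, consistent with the statement that condensation would promote delamination further.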

      Reviewing Editor (Recommendations For The Authors):

      Please note also the following excerpts from discussions amongst the reviewers and the Reviewing Editor:

      Regarding Reviewer #2's Point 2:

      I believe the authors have assumed patterned contractility in their simulations, and this is shown by the "pale blue" cell color (see also lines 162-163). However, as Reviewer #2 points out in their point 2), the pale colors are very hard to see and therefore easy to miss.

      We updated the figure coloring and also added the gradient pattern of contractility.

      Regarding Reviewer #2's point 5:

      It is indeed unconventional to call the "J" terms contractility, they are usually called contact energy or adhesive energy.

      In this study, we included both the contact energy of cell-cell/cell-ECM adhesion and actomyosin activity in the surface contractility, and used the “J” term because it is conventional in the cellular Potts model.

      On the other hand, due to the parameters chosen for J_apical and J_basal in the pale blue cells, the apical membrane area will tend to shrink and the basal membrane will tend to enlarge. Because the lateral membrane energy J_lateral is constant among all cells (I think?), this will effectively drive cells to apically contract in the center.

      That expectation was an initial motivation of our study, but we found that the differential J alone could not drive the cells to apically contract in the center.

      I agree that extra clarification by the authors would be very helpful here.

      Reviewer #2:

      Regarding the patterned contractility: indeed, I missed this point (the pale blue region is really poorly visible).

      Nevertheless, it seems that contractility in the authors' model changes in a step-like fashion.

      [...] There may be important differences between furrowing under step-like patterning profile versus smooth "bell-like" patterning (see Supplementary Figure 13 in Rauzi et al. Nat Commun 2015). In particular, in the case of a step-like patterning, [there are] constrictions of side cells (similar to what the authors in this manuscript report), whereas in the bell-like patterning, [...] such side constrictions [do not occur].

      As stated in our reply to Reviewer #2's comment (2), we added simulations with gradient-pattern contractility.

    1. eLife assessment

      This important study presents the structure of human heparan-alpha-glucosaminide N-acetyltransferase (HGSNAT) in the acetyl-CoA bound state, providing the first description of the architecture of this family of integral membrane enzymes, and revealing the mode of acetyl-CoA binding. The structural work is convincing, with a high resolution and isotropic single-particle cryoEM map and an atomic model that is well-justified by the density map, with strong density for the acetyl-CoA ligand. However, experimental support for the molecular mechanism of the HS acetylation reaction and the impact of disease-causing mutations is incomplete. This work will be of interest to biochemists and structural biologists studying the structure and function of integral membrane enzymes, as well as those interested in genetic diseases resulting from mutations in this family of enzymes, such as mucopolysaccharidosis IIIC (MPS III-C).

    1. Author response:

      The following is the authors’ response to the original reviews.

      We extend our sincere gratitude to the reviewers for their constructive feedback and valuable suggestions, which have significantly contributed to enhancing the quality of our work. In response to the comments, we have meticulously revised our manuscript with the following updates:

      (1) New Data Inclusion: We have incorporated new immunofluorescent staining images, FACS analysis of monocytes, and single-cell RNA sequencing (scRNAseq) expression analysis focusing on genes related to IFNGR, as well as T cell memory subsets (Trm, Tcm, and Tem).

      (2) Comparative Analysis: We have conducted a comparative analysis between the active vitiligo dFBs and the ACD pAd (r5) identified in our study, which provides further insight into the immune response mechanisms.

      (3) Discussion Expansion: We have expanded the discussion to include the role of tissue-resident memory (Trm) T cells in our model and have addressed the limitations of our animal model and in vitro studies.

      (4) Supplemental Material: As requested by the reviewers, we have provided four new supplemental tables (Table S2 ~ S5) and specific information for antibodies used in our study.

      Please see our Point-to-Point Responses to Reviewers' comments below:

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Liu et al. used scRNA-seq to characterize cell type-specific responses during allergic contact dermatitis (ACD) in a mouse model, specifically the hapten-induced DNFB model. Using the scRNA-seq data, they deconvolved the cell types responsible for the expression of major inflammatory cytokines such as IFNG (from CD4 and CD8 T cells), IL4/13 (from basophils), IL17A (from gd T cells), and IL1B from neutrophils and macrophages. They found the highest upregulation of a type 1 inflammatory response, centering around IFNG produced by CD4 and CD8 T cells. They further identified a subpopulation of dermal fibroblasts that upregulate CXCL9/10 during ACD and provided functional genetic evidence in their mouse model that disrupting IFNG signaling to fibroblasts decreases CD8 T cell infiltration and overall inflammation. They identify an increase in IFNG-expressing CD8 T cells in human patient samples of ACD vs. healthy control skin and co-localization of CD8 T cells with PDGFRA+ fibroblasts, which suggests this mechanism is relevant to human ACD. This mechanism is reminiscent of recent work (Xu et al., Nature 2022) showing that IFNG signaling in dermal fibroblasts upregulates CXCL9/10 to recruit CD8 T cells in a mouse model of vitiligo. Overall, this is a very well-presented, clear, and comprehensive manuscript. The conclusions of the study are mostly well supported by data, but some aspects of the work could be improved by additional clarification of the identity of the cell types shown to be involved, including the exact subpopulation discovered by scRNA-seq and the subtype of CD8 T cell involved. The study was limited by its use of one ACD model (DNFB), which prevents an assessment of how broadly relevant this axis is. The human sample validation is slightly circumstantial and limited by the multiplexing capacity of immunofluorescence markers.

      Strengths:

      Through deep characterization of the in vivo ACD model, the authors were able to determine which cell types were expressing the major cytokines involved in ACD inflammation, such as IFNG, IL4/13, IL17A, and IL1B. These analyses are well-presented and thoughtful, showing first that the response is IFNG-dominant, then focusing on deeper characterization of lymphocytes, myeloid cells, and fibroblasts, which are also validated and complemented by FACS experiments using canonical markers of these cell types as well as IF staining. Crosstalk analyses from the scRNA-seq data led the authors to focus on IFNG signaling fibroblasts, and in vitro experiments demonstrate that CXCL9 and CXCL10 are expressed by fibroblasts stimulated by IFNG. In vivo functional genetic evidence demonstrates an important role for IFNG signaling in fibroblasts, as KO of Ifngr1 using Pdgfra-Cre Ifngr1 fl/fl mice, showed a reduction in inflammation and CD8 T cell recruitment.

      Weaknesses:

      (1) The use of one model limits an understanding of how broad this fibroblast-T cell axis is during ACD. However, the authors chose the most commonly employed model and cited additional work in a vitiligo model (another type 1 immune response).

      We thank the reviewer for pointing out this limitation. Although the DNFB-elicited ACD model is the most commonly used animal model for ACD, our study is limited by the use of only one type 1 immune response model. We have now added new data (Figure 5-figure supplement 1A) showing that the active ACD pAd (r5) and the active IFNγ-responsive vitiligo dFBs (Xu et al., 2022) are enriched with a highly similar panel of IFNγ-inducible genes. Future studies are still needed to determine whether this fibroblast-T cell axis may be broadly applied to other ACD models or to other type-1 immune response-related inflammatory skin diseases.

      (2) The identity of the involved fibroblasts and T cells in the mouse model is difficult to assess as scRNA-seq identified subpopulations of these cell types, but most work in the Pdgfra-Cre Ifngr1 fl/fl mice used broad markers for these cell types as opposed to matched subpopulation markers from their scRNA-seq data.

      Thanks for the reviewer's constructive comments. To better showcase the dWAT layer where PDGFRA+ pAds are enriched, we have included new histological staining and PLIN1 (adipocyte marker) staining in new Figure 4 - figure supplement 1F-G. As shown in Figure 4 - figure supplement 1G, the PLIN1+ dWAT layer is located in the lower dermis right above the cartilage layer. In Figure 4-figure supplement 1I and J, we have shown that phospho-STAT1 (pSTAT1), a key signaling molecule activated by IFNγ, was detected primarily in PDGFRA+Ly6A+ pAds in the lower dermis where dWAT is located. In addition, we have now included new data showing that the pAd (dFB_r5) cluster preferentially expressed the highest levels of both Ifngr1 and Ifngr2 among all dFB subclusters (new Figure 5 - figure supplement 1B). Furthermore, we have included new co-staining data showing that CXCL9 largely co-localized with ICAM1 (new Figure 4 - figure supplement 1K), a marker for committed pAds (Merrick et al., 2019), in the reticular dermis and dWAT region of the ACD skin, further confirming that CXCL9 is specifically induced in the pAd subset of dFBs. Additionally, we included new staining data showing that ACD-mediated induction of CXCL9 in ICAM1+ dFBs was largely suppressed upon targeted deletion of Ifngr1 in Pdgfra+ dFBs (new Figure 6 - figure supplement 1D-E).

      (3) Human patient samples of ACD were co-stained with two markers at a time, demonstrating the presence of CD8+IFNG+ T cells, PDGFRA+CXCL10+ fibroblasts, and co-localization of PDGFRA+ fibroblasts and CD8+ T cells. However, no IF staining demonstrates co-expression of all 4 markers at once; thus, the human validation of co-localization of CD8+IFNG+ T cells and PDGFRA+CXCL10+ fibroblasts is ultimately indirect, although not a huge leap of faith. Although n=3 samples of healthy control and ACD samples are used, there is no quantification of any results to demonstrate the robustness of differences.

      Thanks for the reviewer’s constructive comments. We have shown that PDGFRA colocalizes with CXCL10 in the dermal microvascular structures, where CD8+ T cells infiltrate around PDGFRA+ dFBs. We are sorry that, due to technical issues (antibody compatibility), we cannot provide the four-color co-staining suggested by the reviewer. To demonstrate the robustness and reproducibility of the staining presented, we have now supplemented 4 independent images for both Fig. 7A and Fig. 7E in the updated Figure 7-figure supplement 1A-B.

      Reviewer #2 (Public Review):

      Summary:

      The investigators apply scRNA seq and bioinformatics to identify biomarkers associated with DNFB-induced contact dermatitis in mice. The bioinformatics component of the study appears reasonable and may provide new insights regarding TH1-driven immune reactions in ACD in mice. However, the IF data and images of tissue sections are not clear and should be improved to validate the model.

      Strengths:

      The bioinformatics analysis.

      Weaknesses:

      The IF data presented in 4H, 6H, 7E and 7F are not convincing and need to be correlated with routine staining on histology and different IF markers for PDGFR. Some of the IF staining data demonstrates a pattern inconsistent with its target.

      We are sorry for the confusion. Figures 4H and 6H show staining of mouse skin sections, whereas Figures 7E and 7F show staining of human skin sections; therefore, the patterns of PDGFRA+ dFBs appear inconsistent between species. As shown in Fig. 4H, in mouse skin, PDGFRA+CXCL9/10+ dFBs are located between the lower reticular dermis and dWAT region, where preadipocytes are located (Sun et al., 2023). To better showcase the dWAT layer where PDGFRA+ pAds are enriched, we have included new histological staining and PLIN1 (adipocyte marker) staining in new Figure 4 - figure supplement 1F-G. As shown in Figure 4 - figure supplement 1G, the PLIN1+ dWAT layer is located in the lower dermis right above the cartilage layer. Furthermore, we have included new co-staining data showing that CXCL9 largely co-localized with ICAM1 (new Figure 4 - figure supplement 1K), a marker for committed pAds (Merrick et al., 2019), in the reticular dermis and dWAT region of the ACD skin, further confirming that CXCL9 is specifically induced in the pAd subset of dFBs.

      As shown in Fig. 7E, in human skin, PDGFRA+CXCL10+ dFBs are located within the microvascular structures at the dermal-epidermal junction (DEJ) region, where mesenchymal stem cells are enriched (Russell-Goldman & Murphy, 2020). We have included the corresponding HE histological staining image for Fig. 4H in new Figure 4-supplement 1F. The histological staining for Fig. 6H is the HE staining image in Fig. 6F. The histological staining for Fig. 7E and 7F is the Masson’s trichrome staining (a three-colour histological stain) shown in Fig. 7C.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Major comments:

      (1) While the focus on fibroblast and T cell interactions and overall biological findings regarding these interactions (IFNG - CXCL9/10 - CXCR3) is sound, it is slightly confusing about which exact subpopulations of these cells are involved in ACD pathogenesis as both scRNA-seq and IF are used but very broad markers are used for IF. Regarding fibroblasts, the scRNA-seq identifies the pAd (r5) cluster of fibroblasts as the main producer of CXCL9/10. However, the expression of IFNGR1 was not shown for this subpopulation as well as for other fibroblast subpopulations. Figure 6C shows IFNGR1 staining in the Ifngr1 fl/fl control mice which appears quite broad. With the seemingly broad expression of IFNGR1, why is it that only a subpopulation of fibroblasts upregulate CXCL9/10? Is there a specific location of these pAd fibroblasts that help drive this IFNG response? Please show the expression of Ifngr1 in the fibroblast scRNA-seq data.

      Thanks for the reviewer’s constructive comments. We have now included new data showing that the pAd (dFB_r5) cluster preferentially expressed higher levels of both Ifngr1 and Ifngr2 among all dFB subclusters (new Figure 5 - figure supplement 1B). In addition, we included new co-staining data showing that CXCL9 largely co-localized with ICAM1, a marker for committed pAds (Merrick et al., 2019), in the reticular dermis and dWAT region of the ACD skin, further confirming that CXCL9 is specifically induced in the pAd subset of dFBs.

      (2) Regarding T cells, it is slightly confusing regarding what role the fibroblast-produced CXCL9/10 plays on T cell migration vs. activation. This is mainly because in vitro work focuses on T cell activation, while in vivo work seems to mainly assess T cell migration into the tissue. The in vivo studies have nicely shown that CD8 T cells are the main cell type affected by Ifngr1 iKO (i.e., a reduction of these cells), but T cell activity in vivo is not assessed (in the form of IFNG production). I have the following related questions:

      a. Authors do not discuss whether T cells involved in ACD in their model are tissue-resident memory T cells (Trm) or whether these are recruited from circulation. This may be possible to assess via additional analysis of the scRNA-seq data (looking for expression of Trm markers). 

      Thanks for the reviewer’s constructive comments. We have now included new data showing the expression of marker genes of various memory T cells in various T cell subclusters (new Figure 2 - figure supplement 1C-D). Antigen-specific CD8 or CD4 memory T cells can be classified into CD62hi/CCR7hi/CD28hi/CD27hi/CX3CR1lo central memory T cells (Tcm), CX3CR1hi/Cd28hi/Cd27lo/CD62lo/CCR7lo effector memory T cells (Tem), and CD49ahi/CD103hi/ CD69hi/BLIMP1hi tissue-resident memory T cells (Trm) (Benichou, Gonzalez, Marino, Ayasoufi, & Valujskikh, 2017; Cheon, Son, & Sun, 2023; Mackay et al., 2013; Martin & Badovinac, 2018; Park et al., 2023). We observed that in ACD skin, CD4+ and CD8+ T cells predominantly expressed marker genes associated with Tcm including Cd28, Cd27, Ccr7, and S1pr1/Cd62l. In contrast, marker genes associated with Tem (Cx3cr1) and Trm (Itga1/Cd49a, Itgae/Cd103, Cd69 and Prdm1/Blimp1, Cd127/Il7r) were only scarcely expressed in these αβ T cells, suggesting that ACD predominantly triggers a central memory T cell response in the skin.

      Furthermore, this hypothesis is supported by new lymph node gene expression results. We showed that the expression of Ifng, but not Il4 or Il17a, was rapidly induced in skin draining lymph nodes at 24 hours after ACD elicitation (new Figure 1-figure supplement 1H). This suggests a robust and systemic activation of type 1 memory T cell response in the early stage of ACD, and the migration of these lymphatic memory T cells to the skin may contribute to the exacerbation of skin inflammation.
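
      The marker-based reasoning in this response can be sketched as a simple scoring of marker panels. The panels below reuse only the gene symbols listed in the response. The binary "expressed or not" call and the function names are simplifying assumptions for illustration and are not the analysis the authors performed.

```python
# Marker genes per memory T cell subset, as listed in the response above.
MEMORY_MARKERS = {
    "Tcm": {"Cd28", "Cd27", "Ccr7", "S1pr1"},
    "Tem": {"Cx3cr1"},
    "Trm": {"Itga1", "Itgae", "Cd69", "Prdm1", "Il7r"},
}


def marker_fraction(expressed: set) -> dict:
    """Fraction of each subset's marker panel found among the expressed genes."""
    return {
        subset: len(markers & expressed) / len(markers)
        for subset, markers in MEMORY_MARKERS.items()
    }


def best_matching_subset(expressed: set) -> str:
    """Subset whose marker panel is most completely covered."""
    scores = marker_fraction(expressed)
    return max(scores, key=scores.get)
```

      For a hypothetical cluster expressing Cd28, Cd27, Ccr7, and Cd69, this sketch scores Tcm at 0.75, Trm at 0.2, and Tem at 0, so the cluster would be labeled Tcm. A real analysis would work with expression levels rather than presence or absence.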

      b. Authors have focused on CXCR3 axis involvement in IFNG production (Figures 5G-H) without assessing the presumed migratory role of this axis. Presumably, CD8 T cells are recruited to the skin via the CXCL9/10-CXCR3 axis, but this would be important to clarify given other work that has demonstrated Trm involvement in ACD. Authors should at least discuss how their model and findings support, refine, or even contradict the current paradigm of Trm involvement in ACD (Lefevre et al., 2021; PMID: 34155157).

      We are grateful for the constructive feedback provided by the reviewer. CXCR3 is a chemokine receptor on T cells and not only plays a pivotal role in the trafficking of type 1 T cells, but also is required for optimal generation of IFNG-secreting type 1 T cells in vivo (Groom et al., 2012). Our in vitro study is limited by only focusing on CXCL9/10-CXCR3 axis involvement in IFNγ production without studying its role in driving T cell migration. We have now addressed this limitation in the discussion section.

      In the murine model of ACD, the initial sensitization phase involves exposing mouse skin to a high dose of DNFB to prime effector T cells in lymphoid organs, and this is followed by a later challenge/elicitation phase, during which the mice are re-exposed to a lower dose of DNFB in a different area of the skin, distal from the original sensitization site (Manresa, 2021; Vocanson, Hennino, Rozieres, Poyet, & Nicolas, 2009). Our updated analysis of the expression of marker genes associated with central memory T cells (Tcm), effector memory T cells (Tem), and tissue-resident memory T cells (Trm), as presented in the revised Figure 2-figure supplement 1C-D, indicates that the type-1 inflammation observed upon ACD elicitation is predominantly driven by memory T cells recruited from lymphoid organs, rather than by skin resident memory T cells. We have read the reference provided by the reviewer along with a few other related studies indicating that Trm is involved in ACD. We found that these studies performed the elicitation phase on the same skin area where the initial sensitization was conducted, and that only under this condition does elicitation result in a rapid allergen-induced skin inflammatory response that is primarily mediated by IL17A-producing and IFNγ-producing CD8+ skin resident memory T cells (Gadsboll et al., 2020; Murata & Hayashi, 2020; Schmidt et al., 2017; Wongchang et al., 2023). These studies suggest that Trm cells establish a long-lasting local memory during the initial sensitization, and upon re-exposure to the hapten in the same skin area, these site-specific Trm cells can rapidly contribute to a robust type-1 skin inflammatory response. Therefore, a robust involvement of Trm in ACD requires repeated exposure to the same hapten on the same skin area. We have now added related discussion in the discussion section.

      c. While it may be difficult to assess given reduced numbers of CD8 T cells in the Ifngr1 iKO, is the CXCL9/10-CXCR3 axis affecting IFNG production by T cells in vivo?

      Yes, we have shown in Fig. 6G that ACD-mediated induction of Ifng was significantly suppressed in the Ifngr1-iKO mice compared to the control mice.

      (3) The authors cite prior work (Xu et al. Nature 2022) that demonstrated a similar mechanism for fibroblasts in recruiting vitiligo-inducing T cells. Are the pAd (r5) cluster of fibroblasts similar to the fibroblast subpopulation that drives vitiligo?

      The study on the mouse model of vitiligo (Xu et al. Nature 2022) did not perform single-cell RNAseq of the vitiligo mouse skin. Instead, they conducted RNAseq analysis on the sorted PDGFRA+ dFBs. Therefore, we cannot directly compare our pAd (r5) cluster with the fibroblast subpopulation that drives vitiligo. Nevertheless, by utilizing a Venn diagram to compare the top 100 IFNγ signaling-dependent genes upregulated in the active vitiligo mouse dFBs and the top 100 genes enriched in our ACD pAd (dFB_r5) cells, we identified 29 commonly upregulated genes between the two conditions (Figure 5-figure supplement 1A). Furthermore, all these 29 genes were among the top IFNγ-inducible genes in primary dFBs. These shared genes include CXCL9, CXCL10, and several other downstream targets of IFNγ signaling, such as B2M, BST2, CD274, as well as the GBP family members GBP3, GBP4, GBP5, GBP7, and additional genes like H2-K1, H2-Q4, H2-Q7, H2-T23, IFIT3, ISG15, and STAT1. This result suggests that the pAd (dFB_r5) cells possess a common IFNγ-pathway gene signature with the active vitiligo mouse dFBs, indicating a potential overlap in molecular pathways.
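
      The Venn-diagram comparison described here amounts to a set intersection of two ranked gene lists. A minimal sketch follows. The function name, the case normalization, and the `top_n` default are illustrative assumptions, and the sketch does not reproduce the authors' gene rankings.

```python
def shared_genes(ranked_a: list, ranked_b: list, top_n: int = 100) -> list:
    """Genes present in the top_n entries of both ranked lists.

    Symbols are upper-cased first so that differences in capitalization
    between datasets do not hide a match.
    """
    top_a = {gene.upper() for gene in ranked_a[:top_n]}
    top_b = {gene.upper() for gene in ranked_b[:top_n]}
    return sorted(top_a & top_b)
```

      For example, with two toy lists such as `["Cxcl9", "Cxcl10", "Stat1", "GeneA"]` and `["Cxcl10", "Cxcl9", "Isg15", "Stat1"]` (placeholders, not real rankings), the function returns the three symbols common to both lists.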

      (4) The authors should include bulk RNA-seq data from fibroblast stimulation (Figure 5b) at a minimum in the GEO submission. They should ideally include the differentially expressed genes in a supplementary table.

      Thanks for the reviewer’s constructive comments. We have now included the raw FPKM file for the bulk RNAseq data shown in Fig. 5 in Supplemental Table S3, and the list for differentially expressed genes in Supplemental Table S4.

      (5) The authors state that human sample stainings were n = 3 per group for healthy control and ACD (Figure 7), but no quantification or statistical testing is provided to demonstrate significant differences in findings such as co-localization of fibroblasts and T cells, IFNG+CD8+ T cells, etc.

      Thanks for the reviewer’s constructive comments. We have now supplemented 4 independent images for both Fig. 7A and Fig. 7E in the new Figure 7-figure supplement 1A-B to demonstrate the robustness and reproducibility of the staining presented.

      Minor comments:

      (1) Figure 1G, possible typos, Il14 and Il11b are on the violin plots when I believe authors meant Il4 and Il1b.

      Thanks a lot for pointing out these typos. We have now made the correction in the updated manuscript Figure 1.

      (2) The authors label cluster 27 as neutrophils based on the expression of Ly6g and S100a8. These markers are also expressed by Cd14+ inflammatory monocytes. I believe the authors need to additionally validate that these cells are neutrophils (via staining or additional analyses). Neutrophils are notoriously difficult to capture in scRNA-seq given low RNA content. Later, they are quantified by FACS using CD11b+Ly6G+ markers, but I do not believe this would distinguish them from CD14+ monocytes. As this is a relatively minor aspect of the manuscript, I consider this a minor concern, but a finding that should be as accurate as possible as Il1b is likely important, and identifying its accurate source likewise.

      Thanks a lot for the reviewer’s constructive comments. According to the reviewer’s suggestion, we have now added Cd14 expression in Figure 1C, and found that cluster 27 indeed expressed not only Ly6G but also Cd14. According to the literature, the expression of Ly6G in circulating blood, spleen, and peripheral tissues is limited to neutrophils, whereas monocytes, macrophages, and lymphocytes are negative for Ly6G (Ikeda et al., 2023; Lee, Wang, Parisini, Dascher, & Nigrovic, 2013). Therefore, Ly6G can be used as a marker to distinguish neutrophils from monocytes. Although CD14 is highly expressed in monocytes, neutrophils can also express CD14 at a lower level (Antal-Szalmas, Strijp, Weersink, Verhoef, & Van Kessel, 1997). Therefore, cluster 27 is likely a mixed population of neutrophils and monocytes, and we have renamed this cluster NEU/Mon in the updated manuscript.

      To confirm the presence of neutrophils and monocytes in ACD, we have included new FACS analysis of inflammatory monocytes, which are gated as CD11B+Ly6G-F4/80-CD11C-Ly6Chi, according to a published FACS protocol (Rose, Misharin, & Perlman, 2012). We found that elicitation of ACD led to a transient influx of monocytes at 24 hrs post treatment, whereas the percentage of neutrophils continued to increase by 60 hours post-treatment (Figure 3L, and Figure 3-figure supplement 1G). In addition, at 60 hrs, the percentage of neutrophils (~5%) was > 10 times greater than the percentage of monocytes (~0.4%), indicating that neutrophils are the dominant granulocytes at 60 hours post ACD elicitation.

      (3) The authors should include a cluster marker table as a supplementary file to accompany Figure 1C. Only top cluster markers are shown in 1C.

      Thanks a lot for the reviewer’s constructive comments. We have now included the top 5 enriched genes in each cell cluster shown in Fig. 1C in supplementary Table S2.

      (4) Figures 2A/B have mismatched labels. There is a gdT/ILC2 label in the 2B, but not in 2A. Please match these. Along these lines, which gdT cluster is the IL17A expressing cluster as shown in 1D? Matching these labels will clarify which population is doing what.

      Thanks a lot to the reviewer for pointing out this mistake. To avoid confusion about the T cell clusters, we have added a specific recluster number to the T cell clusters, r0~r7 (Figure 2A-B). The r4 cluster is a mixed population of δγT and ILC2, and is therefore termed δγT/ILC2. As shown in Figure 2-figure supplement 1E, IL17A is primarily expressed in the δγT cell cluster (r5). We have now corrected δγT2 to δγT/ILC2 throughout the manuscript. To avoid confusion, we have now added cluster numbers in the updated Figure 2D.

      (5) In Figure 3E, the authors used CD11B as a distinguishing marker for basophils (CD11B+) vs. mast cells (CD11B-). Mcpt8 is a better distinguishing marker, so I am wondering why the authors chose CD11B.

      Thanks a lot for the reviewer’s comments. In scRNAseq, we did use Mcpt8 as a basophil-specific marker to distinguish basophils from mast cells (see Figure 1C). However, Mcpt8 is not a surface receptor that can be used in FACS analysis. Therefore, to distinguish basophils from mast cells by FACS, we have to choose surface markers expressed on these cells. FcεR1a is a highly specific marker expressed exclusively on basophils and mast cells, and CD11B is expressed on basophils but not on mature mast cells (Hamey et al., 2021). As a result, FACS analysis of the surface expression of CD11B and FcεR1a can distinguish basophils (CD11B+ FcεR1a+) from mast cells (CD11B- FcεR1a+). The use of CD11B and FcεR1a to distinguish basophils from mast cells can also be seen in a published reference study (Arinobu et al., 2005).
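
      The two-marker gating logic in this response can be written as a small decision rule. The sketch below assumes that each cell has already been called positive or negative for each marker, whereas real gating applies thresholds to fluorescence intensities. The function name and the "other" label are illustrative.

```python
def classify_fcer1a_cell(cd11b_positive: bool, fcer1a_positive: bool) -> str:
    """Label a cell from its CD11B and FcεR1a surface staining calls."""
    if not fcer1a_positive:
        # FcεR1a-negative cells are neither basophils nor mast cells here.
        return "other"
    # Among FcεR1a+ cells, CD11B separates basophils from mast cells.
    return "basophil" if cd11b_positive else "mast cell"
```

      The rule mirrors the gates named in the response: CD11B+ FcεR1a+ cells are called basophils and CD11B- FcεR1a+ cells are called mast cells.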

      (6) Antibody information is missing for IF studies. No clones, catalog numbers, vendors, RRIDs, or dilutions are included in the Methods section for any of the IF data.

      Thanks a lot for the reviewer’s constructive comments. We have now added the related information for all the antibodies we used for FACS or IF data in the Methods section.

      (7) Figure 3 supplement E and F appear to be reversed based on legend descriptions.

      Thanks a lot for pointing this out. We have now made the correction in the updated Supplementary file.

      References:

      Antal-Szalmas, P., Strijp, J. A., Weersink, A. J., Verhoef, J., & Van Kessel, K. P. (1997). Quantitation of surface CD14 on human monocytes and neutrophils. J Leukoc Biol, 61(6), 721-728. doi:10.1002/jlb.61.6.721

      Arinobu, Y., Iwasaki, H., Gurish, M. F., Mizuno, S., Shigematsu, H., Ozawa, H., . . . Akashi, K. (2005). Developmental checkpoints of the basophil/mast cell lineages in adult murine hematopoiesis. Proc Natl Acad Sci U S A, 102(50), 18105-18110. doi:10.1073/pnas.0509148102

      Benichou, G., Gonzalez, B., Marino, J., Ayasoufi, K., & Valujskikh, A. (2017). Role of Memory T Cells in Allograft Rejection and Tolerance. Front Immunol, 8, 170. doi:10.3389/fimmu.2017.00170

      Cheon, I. S., Son, Y. M., & Sun, J. (2023). Tissue-resident memory T cells and lung immunopathology. Immunol Rev, 316(1), 63-83. doi:10.1111/imr.13201

      Gadsboll, A. O., Jee, M. H., Funch, A. B., Alhede, M., Mraz, V., Weber, J. F., . . . Bonefeld, C. M. (2020). Pathogenic CD8(+) Epidermis-Resident Memory T Cells Displace Dendritic Epidermal T Cells in Allergic Dermatitis. J Invest Dermatol, 140(4), 806-815 e805. doi:10.1016/j.jid.2019.07.722

      Groom, J. R., Richmond, J., Murooka, T. T., Sorensen, E. W., Sung, J. H., Bankert, K., . . . Luster, A. D. (2012). CXCR3 chemokine receptor-ligand interactions in the lymph node optimize CD4+ T helper 1 cell differentiation. Immunity, 37(6), 1091-1103. doi:10.1016/j.immuni.2012.08.016

      Hamey, F. K., Lau, W. W. Y., Kucinski, I., Wang, X., Diamanti, E., Wilson, N. K., . . . Dahlin, J. S. (2021). Single-cell molecular profiling provides a high-resolution map of basophil and mast cell development. Allergy, 76(6), 1731-1742. doi:10.1111/all.14633

      Ikeda, N., Kubota, H., Suzuki, R., Morita, M., Yoshimura, A., Osada, Y., . . . Asano, K. (2023). The early neutrophil-committed progenitors aberrantly differentiate into immunoregulatory monocytes during emergency myelopoiesis. Cell Rep, 42(3), 112165. doi:10.1016/j.celrep.2023.112165

      Lee, P. Y., Wang, J. X., Parisini, E., Dascher, C. C., & Nigrovic, P. A. (2013). Ly6 family proteins in neutrophil biology. J Leukoc Biol, 94(4), 585-594. doi:10.1189/jlb.0113014

      Mackay, L. K., Rahimpour, A., Ma, J. Z., Collins, N., Stock, A. T., Hafon, M. L., . . . Gebhardt, T. (2013). The developmental pathway for CD103(+)CD8+ tissue-resident memory T cells of skin. Nat Immunol, 14(12), 1294-1301. doi:10.1038/ni.2744

      Manresa, M. C. (2021). Animal Models of Contact Dermatitis: 2,4-Dinitrofluorobenzene-Induced Contact Hypersensitivity. Methods Mol Biol, 2223, 87-100. doi:10.1007/978-1-0716-1001-5_7

      Martin, M. D., & Badovinac, V. P. (2018). Defining Memory CD8 T Cell. Front Immunol, 9, 2692. doi:10.3389/fimmu.2018.02692

      Merrick, D., Sakers, A., Irgebay, Z., Okada, C., Calvert, C., Morley, M. P., . . . Seale, P. (2019). Identification of a mesenchymal progenitor cell hierarchy in adipose tissue. Science, 364(6438). doi:10.1126/science.aav2501

      Murata, A., & Hayashi, S. I. (2020). CD4(+) Resident Memory T Cells Mediate Long-Term Local Skin Immune Memory of Contact Hypersensitivity in BALB/c Mice. Front Immunol, 11, 775. doi:10.3389/fimmu.2020.00775

      Park, S. L., Christo, S. N., Wells, A. C., Gandolfo, L. C., Zaid, A., Alexandre, Y. O., . . . Mackay, L. K. (2023). Divergent molecular networks program functionally distinct CD8(+) skin-resident memory T cells. Science, 382(6674), 1073-1079. doi:10.1126/science.adi8885

      Rose, S., Misharin, A., & Perlman, H. (2012). A novel Ly6C/Ly6G-based strategy to analyze the mouse splenic myeloid compartment. Cytometry A, 81(4), 343-350. doi:10.1002/cyto.a.22012

      Russell-Goldman, E., & Murphy, G. F. (2020). The Pathobiology of Skin Aging: New Insights into an Old Dilemma. Am J Pathol, 190(7), 1356-1369. doi:10.1016/j.ajpath.2020.03.007

      Schmidt, J. D., Ahlstrom, M. G., Johansen, J. D., Dyring-Andersen, B., Agerbeck, C., Nielsen, M. M., . . . Bonefeld, C. M. (2017). Rapid allergen-induced interleukin-17 and interferon-gamma secretion by skin-resident memory CD8(+) T cells. Contact Dermatitis, 76(4), 218-227. doi:10.1111/cod.12715

      Sun, L., Zhang, X., Wu, S., Liu, Y., Guerrero-Juarez, C. F., Liu, W., . . . Zhang, L. J. (2023). Dynamic interplay between IL-1 and WNT pathways in regulating dermal adipocyte lineage cells during skin development and wound regeneration. Cell Rep, 42(6), 112647. doi:10.1016/j.celrep.2023.112647

      Vocanson, M., Hennino, A., Rozieres, A., Poyet, G., & Nicolas, J. F. (2009). Effector and regulatory mechanisms in allergic contact dermatitis. Allergy, 64(12), 1699-1714. doi:10.1111/j.1398-9995.2009.02082.x

      Wongchang, T., Pluangnooch, P., Hongeng, S., Wongkajornsilp, A., Thumkeo, D., & Soontrapa, K. (2023). Inhibition of DYRK1B suppresses inflammation in allergic contact dermatitis model and Th1/Th17 immune response. Sci Rep, 13(1), 7058. doi:10.1038/s41598-023-34211-x

      Xu, Z., Chen, D., Hu, Y., Jiang, K., Huang, H., Du, Y., . . . Chen, T. (2022). Anatomically distinct fibroblast subsets determine skin autoimmune patterns. Nature, 601(7891), 118-124. doi:10.1038/s41586-021-04221-8

    2. eLife assessment

      This important study uses single-cell RNA-seq to obtain a more granular understanding of cell subsets within allergic contact dermatitis in a model system with DNFB. The convincing data reveal unique subpopulations of dermal fibroblasts as key responders to interferon gamma and likely as mediators of dermatitis. This study has many novel aspects and provides a unique resource as well.

    3. Reviewer #1 (Public Review):

      In this manuscript, Liu et al. used scRNA-seq to characterize cell type-specific responses during allergic contact dermatitis (ACD) in a mouse model, specifically the hapten-induced DNFB model. Using the scRNA-seq data, they deconvolved the cell types responsible for the expression of major inflammatory cytokines such as IFNG (from CD4 and CD8 T cells), IL4/13 (from basophils), IL17A (from gd T cells), and IL1B from neutrophils and macrophages. They found the highest upregulation of a type 1 inflammatory response, centering around IFNG produced by CD4 and CD8 T cells. They further identified a subpopulation of dermal fibroblasts (pre-adipocytes found in the dermal white adipose tissue layer) that upregulate CXCL9/10 during ACD and provide functional genetic evidence in their mouse model that disrupting IFNG signaling in fibroblasts decreases CD8 T cell infiltration and overall inflammation. They identify an increase in IFNG-expressing CD8 T cells in human patient samples of ACD vs. healthy control skin and co-localization of CD8 T cells with PDGFRA+ fibroblasts, which suggests this mechanism is relevant to human ACD. This mechanism is reminiscent of recent work showing that IFNG signaling in dermal fibroblasts upregulates CXCL9/10 to recruit CD8 T cells in a mouse model of vitiligo. Overall, this is a well-presented, clear, and comprehensive manuscript. The conclusions of the study are well supported by the data, with thoughtful discussion on study limitations by the authors. One such limitation was the use of one ACD model (DNFB), which prevents an assessment of how broadly relevant this axis is. The human sample validation is limited by the multiplexing capacity of immunofluorescence markers but shows a predominance of CD8+/IFNG+ cells and PDGFRA+/CXCL10+ cells in ACD (which are virtually absent in healthy control), along with co-localization of CD8+ cells with PDGFRA+ cells. Thus, this mechanism is likely active in human ACD.

      Strengths:

      Through deep characterization of the in vivo ACD model using scRNA-seq, the authors were able to determine which cell types were expressing the major cytokines involved in ACD inflammation, such as IFNG, IL4/13, IL17A, and IL1B. These analyses are well-presented and thoughtful, showing first that the response is IFNG-dominant, then focusing on deeper characterization of lymphocytes, myeloid cells, and fibroblasts, which are also validated and complemented by FACS experiments using canonical markers of these cell types as well as IF staining. Crosstalk analyses from the scRNA-seq data led the authors to focus on IFNG signaling fibroblasts, and in vitro experiments demonstrate that CXCL9 and CXCL10 are expressed by fibroblasts stimulated by IFNG. In vivo functional genetic evidence demonstrates an important role for IFNG signaling in fibroblasts, as KO of Ifngr1 using Pdgfra-Cre Ifngr1 fl/fl mice, showed a reduction in inflammation and CD8 T cell recruitment. Human ACD sample staining demonstrates the likely activity of the CD8 T cell IFNG-driven fibroblast response in human disease.

      Weaknesses:

      The use of one model limits an understanding of how broad this fibroblast-T cell axis is during ACD. However, the authors chose the most commonly employed model and compared their data to work in a vitiligo model (another type 1 immune response) to demonstrate similar mechanisms at play. Human patient samples of ACD were co-stained with two markers at a time, demonstrating the presence of CD8+IFNG+ T cells, PDGFRA+CXCL10+ fibroblasts, and co-localization of PDGFRA+ fibroblasts and CD8+ T cells. However, no IF staining demonstrates co-expression of all 4 markers at once; thus, the human validation of co-localization of CD8+IFNG+ T cells and PDGFRA+CXCL10+ fibroblasts is ultimately indirect, although more likely than not to be true.

    4. Reviewer #2 (Public Review):

      Summary: The investigators apply scRNA seq and bioinformatics to identify biomarkers associated with the DNFB-induced contact dermatitis in mice. The bioinformatics component of the study appears reasonable and may provide new insights regarding TH1 driven immune reactions in ACD in mice. However, the IF data and images of tissue sections are not clear and should be improved to validate the model.

      Strengths:

      The bioinformatics analysis.

      Weaknesses:

      The IF data presented in 4H, 6H, 7E and 7F are not convincing and need to be correlated with routine staining on histology and different IF markers for PDGFR. Some of the IF staining data demonstrates a pattern inconsistent with its target.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Main points:

      (1) We have added data for fructose in Fig. 1

      (2) We have added statistics (red stars and NS) comparing Tp between fed and refed flies.

      (3) We have changed the data points in each figure to small open circles.

      (4) We have moved the data from Fig. S3 to Fig. 2 and 3.

      (5) We have added schematic diagrams depicting the behavioral assay in Fig. S1.

      (6) We have added heatmaps for WT and Gr64f-Gal4>UAS-CsChrimson flies in Fig. S2.

      (7) We have added Orco1 mutant data in Fig. S4.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This paper presents valuable findings that gustation and feeding state influence the preferred environmental temperature in flies. Interestingly, the authors showed that by refeeding starved animals with the non-nutritive sugar sucralose, they are able to tune their preference towards a higher temperature in addition to nutrient-dependent warm preference. The authors show that temperature-sensing and sweet-sensing gustatory neurons (SGNs) are involved in the former but not the latter. In addition, their data indicate that peptidergic signals involved in internal state and clock genes are required for taste-dependent warm preference behavior.

      The authors made an analogy of their results to the cephalic phase response (CPR) in mammals where the thought, sight, and taste of food prepare the animal for the consumption of food and nutrients. They further linked this behavior to core regulatory genes and peptides controlling hunger and sleep in flies having homologues in mammals. These valuable behavioral results can be further investigated in flies with the advantage of being able to dissect the neural circuitry underlying CPR and nutrient homeostasis.

      Strengths: 

      (1) The authors convincingly showed that tasting is sufficient to drive warm temperature preference behavior in starved flies and that it is independent of nutrient-driven warm preference. 

      (2) By using the genetic manipulation of key internal sensors and genes controlling internal feeding and sleep states such as DH44 neurons and the per genes for example, the authors linked gustation and temperature preference behavior control to the internal state of the animal. 

      Weaknesses: 

      (1) The title is somewhat misleading, as the term homeostatic temperature control linked to gustation only applies to starved flies. 

      We agree with the reviewer's suggestion and have changed the title to "Taste triggers a homeostatic temperature control in hungry flies".

      (2) The authors used a temperature preference assay and refeeding for 5 minutes, 10 minutes, and 1 hour.

      Experimentally, it makes a difference if the flies are tested immediately after 10 minutes or at the same time point as flies allowed to feed for 1 hour. Is 10 minutes enough to change the internal state in a nutrition-dependent manner? Some of the authors' data hint at it (e.g. refeeding with fly food for 10 minutes), but it might be relevant to feed for 5/10 minutes and wait for 55/50min to do the assays at comparable time points. 

      Thank you for your suggestions. The temperature preference behavioral test itself takes 30 minutes from the time the flies are placed in the apparatus until the final choice is made. This means that after the hungry flies have been refed for 5 minutes, they will determine their preferred temperature within 35 minutes. It has been shown that insulin levels peak at 10 minutes and gradually decline (Tsao, et al., PLoS Genetics 2023). However, it is unclear how subtle insulin levels affect behavior and how quickly the flies are able to consume food. These factors may contribute to temperature preference in flies. Therefore, to minimize "extraneous" effects, we decided to perform the behavioral assay immediately after the flies had eaten the food. We have noted in the Materials and Methods section why we chose this condition, based on the duration of the behavioral assay and the insulin effect. 

      (3) A figure depicting the temperature preference assay in Figure 1 would help illustrate the experimental approach. It is also not clear why Figure 1E is shown instead of full statistics on the individual panels shown above (the data is the same). 

      We have revised Figure 1A and added statistics in Figure 1BCD. We also added a figure depicting the temperature preference assay (Fig. S1).

      (4) The authors state that feeding rate and amount were not changed with sucralose and glucose. However, the FLIC assay they employed does not measure consumption, so this statement is not correct, and it is unclear if the intake of sucralose and glucose is indeed comparable. This limits some of the conclusions. 

      We agree, have removed “amount”, and have revised the MS. 

      (5) The authors make a distinction between taste-induced and nutrient-induced warm preference. Yet the statistics in most figures only show the significance between the starved and refed flies, not the fed controls. As the recovery is in many cases incomplete and used as a distinction of nutritive vs nonnutritive signals (see Figure 1E) it will be important to also show these additional statistics to allow conclusions about how complete the recovery is. 

      We agree with the comments and have revised the MS and figures. 

      (6) The starvation period used is ranging from 1 to 3 days, as in some cases no effect was seen upon 1 day of starvation (e.g. with clock genes or temperature sensing neurons). While the authors do provide a comparison between 18-21 and 26-29 hours old flies in Figure S1, a comparison for 42-49 and 66-69 hours of starvation is missing. This also limits the conclusion as the "state" of the animal is likely quite different after 1 day vs. 3 days of starvation and, as stated by the authors, many flies die under these conditions.  

      We mainly used 2 overnights of starvation. Some flies (e.g. Ilp6 mutants) were completely healthy even after 2 overnights of starvation, so we had to starve them for 3 overnights. For example, Ilp6 mutants needed 3 overnights of starvation to show a significant Tp difference between fed and starved flies. On the other hand, some flies (e.g. w1118 control flies) were very sick after 2 overnights of starvation, so we had to starve them for one overnight. Therefore, the starvation conditions used in this manuscript range from 1 to 3 overnights.

      First, we determined the starvation time for each line as the duration that produced a statistically significant Tp difference between fed and starved flies; as mentioned above, flies prefer lower temperatures when starvation is prolonged (Umezaki et al., Current Biology 2018). Therefore, if Tp was not statistically different between fed and starved flies, we extended the starvation time from 1 to 3 overnights. Importantly, we show in Fig. S3 that the duration of starvation did not affect the recovery effect. Furthermore, since control flies do not survive 42-49 or 66-69 hours of starvation, we cannot test the reviewer's suggestion. We have carefully documented the conditions in the Materials and Methods and the figure legends.

      (7) In Figure 2, glucose-induced refeeding was not tested in Gr mutants or silenced animals, which would hint at post-ingestive recovery mechanisms related to nutritional intake. This is only shown later (in Figure S3) but I think it would be more fitting to address this point here. The data presented in Figure S3 regarding the taste-evoked vs nutrient-dependent warm preference is quite important while in some parts preliminary. It would nonetheless be justified to put this data in the main figures. However, some of the conclusions here are not fully supported, in part due to different and low n numbers, which due to the inherent variability of the behavior do not allow statistically sound conclusions. The authors claim that sweet GRNs are only involved in taste-induced warm preference, however, glucose is also nutritive but, in several cases, does not rescue warm preference at all upon removal of GRN function (see Figures S3A-C). This indicates that the Gal4 lines and also the involved GRs are potentially expressed in tissues/neurons required for internal nutrient sensing. 

      Thank you for your suggestion. We have added Figure S3ABC (glucose refeeding using Gr mutants and silenced animals) to Figure 2. There is no low N number since we tested > 5 times, i.e. >100 flies were tested. Tp may have a variation probably due to the effect of starvation on their temperature preference. 

      We did not mention that "The authors claim that sweet GRNs are only involved in taste-induced warm preference...". However, our writing may not have been clear enough. We agree that "...GRs may be expressed in tissues/neurons required for internal nutrient sensing. ..." We have rewritten and revised the section.  

      (8) In Figure 4, fly food and glucose refeeding do not fully recover temperature preference after refeeding. With the statistical comparison to the fed control missing, this result is not consistent with the statement made in line 252. I feel this is an important point to distinguish between state-dependent and taste/nutrition-dependent changes.  

      We inserted the statistics and compared between Fed and other conditions. 

      (9) The conclusion that clock genes are required for taste-evoked warm preference is limited by the observation that they ingest less sucralose. In addition, the FLIC assay does not allow conclusions about the feeding amount, only the number of food interactions. Therefore, I think these results do not allow clear-cut conclusions about the impact of clock genes in this assay.  

      We agree, have removed “amount”, and have revised the MS. The per01 mutants ate (touched) sucralose more often than glucose. On the other hand, tim01 mutants ate glucose more often than sucralose (Figure S6BC). However, these mutants still showed a similar TP pattern for sucralose and glucose refeeding (Fig. 5CD). The results suggest that the tim01 flies eat enough sucralose relative to glucose that their food intake does not affect the TP behavioral phenotype. We have rewritten and revised the section.

      (10) CPR is known to be influenced by taste, thought, smell, and sight of food. As the discussion focused extensively on the CPR link to flies it would be interesting to find out whether the smell and sight of food also influence temperature preference behavior in animals with different feeding states.  

      We have added data from Olfactory receptor co-receptor (Orco1) mutants, which lack olfaction, in Fig. S4. They failed to show the taste-evoked warm preference, but exhibited the nutrient-induced warm preference. Therefore, the data suggest that olfactory detection is also involved in taste-evoked warm preference. On the other hand, "seeing food" is probably more complicated, since light dramatically affects temperature preference behavior and the circadian clock that regulates temperature preference rhythms. Therefore, it is unlikely that a solid conclusion could be drawn from a short set of experiments. We will address this issue in the next study.

      (11) In the discussion in line 410ff the authors claim that "internal state is more likely to be associated with taste-evoked warm preference than nutrient-induced warm preference." This statement is not clear to me, as neuropeptides are involved in mediating internal state signals, both in the brain itself as well as from gut to brain. Thus, neuropeptidergic signals are also involved in nutrient-dependent state changes, the authors might just not have identified the peptides involved here. The global and developmental removal of these signals also limits the conclusions that can be drawn from the experiments, as many of these signals affect different states, circuits, and developmental progression.  

      We agree with the comments. We have removed the sentences and revised the MS.  

      Reviewer #2 (Public Review): 

      Animals constantly adjust their behavior and physiology based on internal states. Hungry animals, desperate for food, exhibit physiological changes immediately upon sensing, smelling, or chewing food, known as the cephalic phase response (CPR), involving processes like increased saliva and gastrointestinal secretions. While starvation lowers body temperature, the mechanisms underlying how the sensation of food without nutrients induces behavioral responses remain unclear. Hunger stress induces changes in both behavior and physiological responses, which in flies (or at least in Drosophila melanogaster) leads to a preference for lower temperatures, analogous to the hunger-driven lower body temperature observed in mammals. In this manuscript, the authors have used Drosophila melanogaster to investigate the issue of whether taste cues can robustly trigger behavioral recovery of temperature preference in starving animals. The authors find that food detection triggers a warm preference in flies. Starved flies recover their temperature preference after food intake, with a distinction between partial and full recovery based on the duration of refeeding. Sucralose, an artificial sweetener, induces a warm preference, suggesting the importance of food-sensing cues. The paper compares the effects of sucralose and glucose refeeding, indicating that both taste cues and nutrients contribute to temperature preference recovery. The authors show that sweet gustatory receptors (Grs) and sweet GRNs (Gustatory Receptor Neurons) play a crucial role in taste-evoked warm preference. Optogenetic experiments with CsChrimson support the idea that the excitation of sweet GRNs leads to a warm preference. The authors then examine the internal state's influence on taste-evoked warm preference, focusing on neuropeptide F (NPF) and small neuropeptide F (sNPF), analogous to mammalian neuropeptide Y. 
Mutations in NPF and sNPF result in a failure to exhibit taste-evoked warm preference, emphasizing their role in this process. However, these neuropeptides appear not to be critical for nutrient-induced warm preference, as indicated by increased temperature preference during glucose and fly food refeeding in mutant flies. The authors also explore the role of hunger-related factors in regulating taste-evoked warm preference. Hunger signals, including diuretic hormone (DH44) and adipokinetic hormone (AKH) neurons, are found to be essential for taste-evoked warm preference but not for nutrient-induced warm preference. Additionally, insulin-like peptides 6 (Ilp6) and Unpaired3 (Upd3), related to nutritional stress, are identified as crucial for taste-evoked warm preference. The investigation then extends into circadian rhythms, revealing that taste-evoked warm preference does not align with the feeding rhythm. While flies exhibit a rhythmic feeding pattern, taste-evoked warm preference occurs consistently, suggesting a lack of parallel coordination. Clock genes, crucial for circadian rhythms, are found to be necessary for taste-evoked warm preference but not for nutrient-induced warm preference. 

      Strengths: 

      A well-written and interesting study, investigating an intriguing issue. The claims, none of which are, to the best of my knowledge, controversial, are backed by a substantial number of experiments. 

      Weakness: 

      The experimental setup used and the procedures for assessing the temperature preferences of flies are rather sparingly described. Additional details and data presentation would enhance the clarity and replicability of the study. I kindly request the authors to consider the following points: 

      i) A schematic drawing or diagram illustrating the experimental setup for the temperature preference assay would greatly aid readers in understanding the spatial arrangement of the apparatus, temperature points, and the positioning of flies during the assay. The drawing should also be accompanied by specific details about the setup (dimensions, material, etc). 

      Thank you for your suggestions. We have added the schematic drawing in Fig. S1.

      ii) It would be beneficial to include a visual representation of the distribution of flies within the temperature gradient on the apparatus. A graphical representation, such as heatmaps or histograms, showing the percentage of flies within each one-degree temperature bin, would offer insights into the preferences and behaviors of the flies during the assay. In addition to the detailed description of the assay and data analysis, the inclusion of actual data plots, especially for key findings or representative trials, would provide readers with a more direct visualization of the experimental outcomes. These additions will not only enhance the clarity of the presented information but also provide the reader with a more comprehensive understanding of the experimental setup and results. I appreciate the authors' attention to these points and look forward to the potential inclusion of these elements in the revised manuscript. 

      Thank you for the advice. We have added heatmaps for WT and Gr64f-Gal4>CsChrimson data in Fig. S2. 
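
      For readers unfamiliar with the binning the reviewer describes, the following is a minimal sketch of how the percentage of flies in each one-degree bin could be computed for a single trial. It is not the authors' analysis code: the function name, the 17 to 33 °C gradient range, and the example positions are all assumptions made for illustration.

```python
from collections import Counter
import math


def percent_per_degree_bin(temps_c, t_min=17, t_max=33):
    """Percentage of flies in each one-degree bin [t, t+1) from t_min to t_max.

    temps_c: temperature (deg C) at each fly's final position in one trial.
    t_min, t_max: assumed gradient limits (hypothetical values).
    """
    bins = list(range(t_min, t_max))  # left edge of each one-degree bin
    # Count flies per bin, ignoring positions outside the assumed gradient.
    counts = Counter(math.floor(t) for t in temps_c if t_min <= t < t_max)
    total = sum(counts.values())
    if total == 0:
        return {b: 0.0 for b in bins}
    return {b: 100.0 * counts.get(b, 0) / total for b in bins}


# Hypothetical final positions for one trial (not real data).
trial = [24.2, 24.8, 25.1, 25.6, 22.3, 24.5, 25.9, 23.4]
dist = percent_per_degree_bin(trial)
for left_edge, pct in dist.items():
    if pct > 0:
        print(f"{left_edge} to {left_edge + 1} C: {pct:.1f}%")
```

      One row of such percentages per replicate, stacked across replicates, would give the kind of heatmap requested above.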

      Reviewer #3 (Public Review): 

      Summary: 

      The manuscript by Yujiro Umezaki and colleagues aims to describe how taste stimuli influence temperature preference in Drosophila. Under starvation flies display a strong preference for cooler temperatures than under fed conditions that can be reversed by refeeding, demonstrating the strong impact of metabolism on temperature preference. In their present study, Umezaki and colleagues observed that such changes in temperature preference are not solely triggered by the metabolic state of the animal but that gustatory circuits and peptidergic signalling play a pivotal role in gustation-evoked alteration in temperature preference. 

      The study of Umezaki is definitively interesting and the findings in this manuscript will be of interest to a broad readership. 

      Strengths: 

      The authors demonstrate interesting new data on how taste input can influence temperature preference during starvation. They propose how gustatory pathways may work together with thermosensitive neurons, peptidergic neurons and finally try to bridge the gap between these neurons and clock genes. The study is very interesting and the data for each experiment alone are very convincing. 

      Weaknesses: 

      In my opinion, the authors have opened many new questions but did not fully answer the initial question - how do taste-sensing neurons influence temperature preferences? What are the mechanisms underlying this observation? Instead of jumping from gustatory neurons to thermosensitive neurons to peptidergic neurons to clock genes, the authors should have stayed within the one question they were asking at the beginning. How does sugar sensing influence the physiology of thermosensation in order to change temperature preference? Before addressing all the following questions of the manuscript the authors should first directly decipher the neuronal interplay between these two types of neurons. 

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Figure S3D is cited before S2, so please rearrange the numbering.

      Thank you. We have changed the numbering.

      I would also suggest a different color to visualize the data points in Figure S3, as some are barely visible on the dark bars (e.g. on a dark green background). 

      We have revised the figures. The data points were changed to smaller open circles. 

      Reviewer #2 (Recommendations For The Authors): 

      *Please, expand on the experimental procedure, and describe the assay in detail. 

      We have added a scheme for the assay in Fig. S1 and also have revised the manuscript and figures.

      *Show the distribution of the gradient data that the preference values are based upon. Not necessarily for all, but for select key experiments. Heatmaps for each replicate (stacked on top of each other) would be a nice way of showing this. Simple histograms would of course work as well. 

      We have added heatmaps of selected key experiments in Fig. S2. We have revised the manuscript and figures accordingly.

      Reviewer #3 (Recommendations For The Authors):

      The manuscript by Yujiro Umezaki and colleagues aims at describing how taste stimuli influence temperature preference in Drosophila. Under starvation, flies display a strong preference for cooler temperatures than under fed conditions that can be reversed by refeeding, demonstrating the strong impact of metabolism on temperature preference. In their present study, Umezaki and colleagues observed that such changes in temperature preference are not solely triggered by the metabolic state of the animal but that gustatory circuits play a pivotal role in temperature preference. The study of Umezaki is definitively interesting and the findings in this manuscript will be of interest to a broad readership. However, I would like to draw the authors' attention to some points of concern: 

      The title to me sounds somehow inadequate. The definition of homeostasis (Cambridge Dictionary) is as follows: "the ability or tendency of a living organism, cell, or group to keep the conditions INSIDE it the same despite any changes in the conditions around it, or this state of internal balance". What do the authors mean by homeostatic temperature control? Reading the title not knowing much about poikilotherm insects I would understand that the authors claim that Drosophila can indeed keep a temperature homeostasis as mammals do. As Drosophila is not a homoiotherm animal and thus cannot keep its body temperature stable the title should be amended.  

      Homeostasis means a state of balance between all the body systems necessary for the body to survive and function properly. Drosophila are ectotherms, so the source of temperature comes from the environment, and their body temperature is very similar to that of their environment. However, the flies' temperature regulation is not simply a passive response to temperature. Instead, they actively seek a temperature based on their internal state. We have shown that the preferred temperature increases during the day and decreases during the night, showing a circadian rhythm of temperature preference (TPR). Because their environmental temperature is very close to their body temperature, TPR gives rise to body temperature rhythms (BTR). We have shown that TPR is similar to BTR in mammals (Kaneko et al., Current Biology 2012 and Goda et al., JBR 2023). Similarly, we showed that the hungry flies choose a lower temperature so that the body temperature is also lower. Therefore, our data suggest that the fly maintains its homeostasis by using the environmental temperature to adjust its body temperature to an appropriate temperature depending on its internal state. Therefore, I would like to keep the title as "Taste triggers a homeostatic temperature control in hungry flies". We have added more explanation in the Introduction and Discussion.

      Accordingly, the authors compare the preference of flies to cooler temperatures to the reduced body temperature of mammals (Lines 64 - 65). However, according to the cited literature the reduced body temperature in starved rats is discussed to reduce metabolic heat production (Sakurada et al., 2000). The authors should more rigorously give a short summary of the findings in the cited papers and the original interpretation to help the reader not get confused.

      In flies, it has been shown that a lower temperature means a lower metabolic rate, and a higher temperature means a higher metabolic rate. Therefore, hungry flies choose a lower temperature where their metabolic rate is lower and they do not need as much heat.

      Similarly, in mammals, starvation causes a lower body temperature, hypothermia. Body temperature is controlled by the balance between heat loss and heat production. The starved mammals showed lower heat production. We have added this information to the introduction. 

      The authors show that 5 min fly food refeeding causes a partial recovery of the naïve temperature preference of the flies (Figure 1B) and that feeding of sucralose partially rescues the preference whereas glucose rescues the preference similar to what refeeding with fly food would do. As glucose is both sweet and metabolically valuable it would be clearer for the reader if the authors start with the fly food experiment and then show the glucose experiment to show that the altered temperature preference depends on the food component glucose. From there they can further argue that glucose is both sweet (hedonic value) and metabolically valuable. And to disentangle sweetness from metabolism one needs a sugar that is sweet but cannot be metabolized - sucralose. 

      Thank you for your advice. Since the data with sucralose is the one we want to highlight the most, we decided to present it in the order of sucralose, glucose, and fly food.

      In the sucralose experiment the authors omit the 5 min data point and only show the 10 min time point. As Figure 1F indicates that both Glucose and Sucralose elicit the same attractiveness in the flies and that sweetness influences the temperature preference, it is important that the authors show the 5 min temperature preference too to underline the effect of the sweet taste stimulus on the fly behavior independent from the caloric value. Further, the authors should demonstrate not only the cumulative touches but how much sucralose or glucose may already be consumed by the fly in the depicted time frames. 

      It would be interesting to see how much sucralose or glucose the flies consume over the time frames shown. Although the cumulative exposure to sugar is ideally equivalent to the amount of sugar, we need a different way to actually measure the amount of sugar. We now emphasize "cumulative touches" rather than "amount of sugar" in the text. In the next study, we will look at how much sucralose or glucose the fly has already consumed.

      Sucralose and Glucose have a similar molecular structure - it would be interesting to see how the sweet taste of a sugar with a different molecular structure like fructose and its receptor Gr43b (Miyamoto & Amrein 2014) may contribute to temperature preferences.

      Sucralose and Glucose are not structurally similar. That said, we tested fructose refeeding anyway. The hungry flies showed a taste-evoked warm preference after fructose refeeding. We have added data in Figure 1E and F. The data suggest that sweet taste is more important than sugar structure. We also tested Gr43b>CsChrimson. However, the flies do not show the taste-evoked warm preference (data not shown). The data suggest that Gr43b is not the major receptor controlling taste-evoked warm preference. We have revised the manuscript.

      Both sugars appear similarly attractive to the flies (Figure 1F) - are water, sucralose, and glucose presented in a choice assay or are these individually in separate experiments? 

      Water, sucralose, and glucose were individually presented in separate experiments. We clarified it in the figure legend.

      Subsequently, the authors address the question of how sweet taste may influence temperature preferences in flies. To this end, the authors first employ gustatory receptor mutants for Gr5a, Gr64a, and Gr61a and demonstrate that sucralose feeding does not rescue temperature preference in the absence of sweet taste receptors. In an alternative approach, the authors do not use mutants but an expression of UAS:Kir in Gr64f neurons. Taking a closer look at the graph it appears that the Kir expressing flies have a higher (by nearly 1 °C) temperature preference than the starved mutant flies. Is this preference change related to the mutation directly and what would be the result if Kir would be conditionally only expressed after development is completed, or is the observed temperature preference related to the Gr64f-Gal4 line? If the latter would be the case perhaps the authors may want to bring the flies to the same genetic background to allow for a more direct comparison of the temperature preferences.

      The Gr64fGal4>Kir flies show a ~one degree higher preferred temperature under starvation compared to the mutants. However, the phenotype is similar to the controls, Gr64fGal4/+ flies, under starvation. Therefore, this phenotype is not due to either the mutation or the Kir effect. Most importantly, the Gr64fGal4>Kir flies failed to show a taste-evoked warm preference. Together with other mutant data, we concluded that sweet GRNs are required for taste-evoked warm preference.

      Overall, the figure legend for Figure 2 is very cryptic and should be more detailed.

      We have revised the figure legend for Figure 2. 

      To shed light on the mechanisms underlying the changes in temperature preferences through gustatory stimuli the authors next blocked heat and cold sensing neurons in fed and starved flies and found that TrpA1 expressing anterior cells and R11F02-Gal4 expressing neurons both participate in sweetness-induced alteration of temperature preference in starved animals. At this point, it should be explicitly indicated in the figure that the flies need more than one overnight starvation to display the behavior (Figure 3A).

      We have revised the manuscript.

      The data provided by the authors indicate a kind of push-and-pull mechanism between heat and cold-sensing neurons under starvation that is somehow influenced by sweet taste sensing. Further, the authors demonstrate that TrpA1- as well as R11F02-Gal4 driven Chrimson activation is sufficient to partially rescue temperature preference under starvation. At this point it is unclear why the authors use a tubGal80ts expression system for R11F02-Gal4 but not for the TrpA1SH-Gal4 driven Chrimson. As the development itself and the conditions under which the animals were raised may have an influence on the temperature preference, it is important that both groups are equally raised if the authors want to directly compare them with each other.

      As we wrote in the Materials and Methods, the R11F02-Gal4>uas-CsChrimson flies died during development. Therefore, we had to use tubGal80ts. On the other hand, the TrpA1-Gal4>CsChrimson flies can survive to adulthood. As we mentioned in the manuscript, all flies were treated with ATR after they had fully developed into adults. This means that both TrpA1-Gal4 and R11F02-Gal4 expressing cells are activated by red light via CsChrimson only in adult stages. We carefully revised the manuscript.

      It is a pity that the authors at this point have decided to not deepen the understanding of the circuitry between thermo-sensation and metabolic homeostasis but subsequently change the focus of their study to investigate how internal state influences taste-evoked warm preference in hungry flies. Using mutants for NPF and sNPF the authors demonstrate that both peptides play a pivotal role in taste-evoked warm preference after sucrose feeding but not for nutrient-induced warm preference. Similarly, they found that DH44, AKH and dILP6, Upd2 and Upd3 neurons are also required for taste-evoked warm preference but not for nutrient-induced warm preference. Here again, the authors do not keep the systems stable and change between inhibition of neurons through Kir and mutants for peptides. For a better comparison, it would be preferable to use always exactly the same technique to inhibit neuron signalling.

      It would be interesting to find the neural circuitry of thermosensation and metabolic homeostasis, but we have not had any luck so far. We will continue to look into the neural circuits which control taste-evoked warm preference and nutrient-induced warm preference. Since UAS-Kir is such a strong effector, it can sometimes kill the flies, so we could not use UAS-Kir for all Gal4 lines.

      DH44 is expressed in the brain and in the abdominal ganglion where they share the expression pattern with 4 Lk neurons per hemisphere. Seeing the impact of Lk signalling in metabolism (AlAnzi et al., 2010) the authors should provide evidence that the observed effect is indeed because of DH44 and not Lk.

      It would be interesting to see if Lk may play a role in taste-evoked warm preference and/or nutrient-induced warm preference. We would like to systematically screen which neuropeptides and receptors are involved in the behavior in the next study. 

      Seeing the results on dILP6 it is interesting that Li and Gong (2015) could show in larvae that cold-sensing neurons directly interact with dILP neurons in the brain. It would be interesting to see whether similar circuitry may exist in adult flies to regulate temperature preferences and these peptidergic neurons. Further, it appears interesting that again these animals need much longer time to display the observed shift in temperature (which again should be clearly indicated in the figure legend too). These observations should be more carefully considered in the discussion part too.

      We have revised the manuscript.

      In the last part of the study, the authors investigate how sensory input from temperature-sensitive cells may transmit information to central clock neurons and how these in turn may influence temperature preference under starvation. The experiments assume that DH44-expressing neurons play a role in the output pathway of the central clock. Using the clock gene null mutants per and tim the authors show that even though the animals display a significant starvation response neither per nor tim mutants exhibited taste-evoked warm preference, indicating a taste but not nutrient-evoked temperature preference regulation. 

      The authors demonstrate interesting new data on how taste input can influence temperature preference during starvation. They propose how gustatory pathways may work together with thermosensitive neurons and peptidergic neurons, and finally try to bridge the gap between these neurons and clock genes. The study is very interesting and the data for each experiment alone are very convincing. However, in my opinion, the authors have opened many new questions but did not fully answer the initial question - how do taste-sensing neurons influence temperature preferences? What are the mechanisms underlying this observation? Instead of jumping from gustatory neurons to thermosensitive neurons to peptidergic neurons to clock genes, the authors should have stayed within the one question they were asking at the beginning. How does sugar sensing influence the physiology of thermosensation? Before addressing all the following questions of the manuscript the authors should first directly decipher the neuronal interplay between these two types of neurons.

      Thank you for your suggestion. It would be interesting to find the neural circuitry of thermosensation and metabolic homeostasis. We have tried but have had no luck so far.

      The authors could, e.g., employ Ca or cAMP imaging in anterior or cold-sensitive cells and see how the responsiveness of these cells may be altered after sugar feeding. Or at least follow the idea of Li and Gong about the thermoregulation of dILP-expressing neurons.

      Thank you for your suggestion. Since we do not know how dILP-expressing neurons are involved in the temperature response in adult flies, we will focus on these cells using calcium imaging in the next study.

      Anatomical analysis using the GRASP technique may further help to understand the interplay of these neurons and give new insights into the circuitry underlying food preference alteration under starvation. 

      Thank you for your suggestion. It would be interesting to find the neural circuitry of thermosensation and metabolic homeostasis. We have tried but have had no luck so far.

      Minor comments: 

      Line 51: Hungry animals are desperate for food - I think the authors should not anthropomorphize at this point too much but rather strictly describe how the animals change their behavior without any interpretation of the mental state of the animal.

      We have modified the manuscript.

      Line 80: Hunger and satiety dramatically affect animal behavior and physiology and control feeding - please not only cite the papers but also give a short overview of the cited papers on which behaviors are altered and how. 

      We have revised the manuscript. 

      Overall statistics: The authors do comparative statistics always against starved animals throughout but often state in the text a comparison against fed (Line 111: "but did not reach that of the fed flies"). I think the authors should describe the data according to their statistics and keep this constant throughout the paper.

      Sorry for the confusion. We originally had it, but we removed it. We have added the additional statistical analyses.  

      Figure legends: Overall the figure legends could be more developed and more detailed.

      We have revised the manuscript.

    2. eLife assessment

      This paper presents valuable findings that gustation and nutrition might independently influence the preferred environmental temperature in flies. The evidence supporting the main claims is solid and well presented. The finding that flies might thus exhibit a cephalic phase response similar to mammals will be of value for future investigations.

    3. Reviewer #1 (Public Review):

      Summary:

      This paper presents valuable findings that gustation and feeding state influence the preferred environmental temperature in flies. Interestingly, the authors showed that by refeeding starved animals with the non-nutritive sugar sucralose, they are able to tune their preference towards a higher temperature in addition to nutrient-dependent warm preference. The authors show that temperature sensing and sweet sensing gustatory neurons (SGNs) are involved in the former but not the latter. In addition, their data indicate that peptidergic signals involved in internal state and clock genes are required for taste-dependent warm preference behavior.

      The authors made an analogy of their results to the cephalic phase response (CPR) in mammals where the thought, sight and taste of food prepares the animal for the consumption of food and nutrients. The authors showed that taste triggers CPR-induced temperature preference behaviors in flies. The authors also briefly covered that the combined modalities of smell and taste induced CPR responses, showing that starved orco mutant flies failed to recover temperature preference after refeeding with sucralose.

      The findings of this work hold promising future research prospects, for example, whether the sight of food influences temperature preference behavior in hungry flies, or whether taste, smell and sight work together or independently in promoting CPR responses.

      Furthermore, these valuable behavioral results can be further investigated in flies with the advantage of being able to dissect the neural circuitry underlying CPR and nutrient homeostasis.

      Strengths:

      (1) The authors convincingly showed that tasting is sufficient to drive warm temperature preference behavior in starved flies and show that it is independent of nutrient-driven warm preference.

      (2) By genetically manipulating key internal sensors and genes controlling internal feeding and sleep state, such as DH44 neurons and the per gene, the authors linked gustation and temperature preference behavior control to the internal state of the animal.

      Weaknesses:

      Most of the weaknesses of the paper have been addressed in the revision. The points mentioned below are meant to improve readability of the paper and to promote understanding of the significance of the work.

      (1) Supplementary Figure 1 could replace Figure 1A. The purpose of Figure 1F is not clear to me as the comparison between the different food substances is not separately addressed anywhere in the text.

      (2) The data for the orco receptor mutant could be placed in the main figures to justify the discussion emphasising CPR-like responses.

    4. Reviewer #2 (Public Review):

      Animals constantly adjust behavior and physiology based on internal states. Hungry animals, desperate for food, exhibit physiological changes immediately upon sensing, smelling, or chewing food, known as the cephalic phase response (CPR), involving processes like increased saliva and gastrointestinal secretions. While starvation lowers body temperature, the mechanisms underlying how the sensation of food without nutrients induces behavioral responses remain unclear. Hunger stress induces changes in both behavior and physiological responses, which in flies (or at least in Drosophila melanogaster) leads to a preference for lower temperatures, analogous to the hunger-driven lower body temperature observed in mammals. In this manuscript, the authors have used Drosophila melanogaster to investigate the issue of whether taste cues can robustly trigger behavioral recovery of temperature preference in starving animals. The authors find that food detection triggers a warm preference in flies. Starved flies recover their temperature preference after food intake, with a distinction between partial and full recovery based on the duration of refeeding. Sucralose, an artificial sweetener, induces a warm preference, suggesting the importance of food-sensing cues. The paper compares the effects of sucralose and glucose refeeding, indicating that both taste cues and nutrients contribute to temperature preference recovery. The authors show that sweet gustatory receptors (Grs) and sweet GRNs (Gustatory Receptor Neurons) play a crucial role in taste-evoked warm preference. Optogenetic experiments with CsChrimson support the idea that the excitation of sweet GRNs leads to a warm preference. The authors then examine the internal state's influence on taste-evoked warm preference, focusing on neuropeptide F (NPF) and small neuropeptide F (sNPF), analogous to mammalian neuropeptide Y.

      Mutations in NPF and sNPF result in a failure to exhibit taste-evoked warm preference, emphasizing their role in this process. However, these neuropeptides appear not to be critical for nutrient-induced warm preference, as indicated by increased temperature preference during glucose and fly food refeeding in mutant flies. The authors also explore the role of hunger-related factors in regulating taste-evoked warm preference. Hunger signals, including diuretic hormone (DH44) and adipokinetic hormone (AKH) neurons, are found to be essential for taste-evoked warm preference but not for nutrient-induced warm preference. Additionally, insulin-like peptide 6 (Ilp6) and Unpaired3 (Upd3), related to nutritional stress, are identified as crucial for taste-evoked warm preference. The investigation then extends into circadian rhythms, revealing that taste-evoked warm preference does not align with the feeding rhythm. While flies exhibit a rhythmic feeding pattern, taste-evoked warm preference occurs consistently, suggesting a lack of parallel coordination. Clock genes, crucial for circadian rhythms, are found to be necessary for taste-evoked warm preference but not for nutrient-induced warm preference.

      Strengths:

      A well-written and interesting study, investigating an intriguing issue. The claims, none of which are, to the best of my knowledge, controversial, are backed by a substantial number of experiments.

      Weakness:

      The experimental setup used and the procedures for assessing the temperature preferences of flies are rather sparingly described. Additional details and data presentation would enhance the clarity and replicability of the study. I kindly request the authors to consider the following points: i) A schematic drawing or diagram illustrating the experimental setup for the temperature preference assay would greatly aid readers in understanding the spatial arrangement of the apparatus, temperature points, and the positioning of flies during the assay. The drawing should also be accompanied by specific details about the setup (dimensions, material, etc.). ii) It would be beneficial to include a visual representation of the distribution of flies within the temperature gradient on the apparatus. A graphical representation, such as heatmaps or histograms, showing the percentage of flies within each one-degree temperature bin, would offer insights into the preferences and behaviors of the flies during the assay. In addition to the detailed description of the assay and data analysis, the inclusion of actual data plots, especially for key findings or representative trials, would provide readers with a more direct visualization of the experimental outcomes. These additions will not only enhance the clarity of the presented information but also provide the reader with a more comprehensive understanding of the experimental setup and results. I appreciate the authors' attention to these points and look forward to the potential inclusion of these elements in the revised manuscript.

      Update: The revised manuscript now includes heatmaps showing the distribution of the flies across the temperature bins, as well as a schematic drawing of the behavioral setup.
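
      As an illustration of the per-bin summary requested above (the percentage of flies within each one-degree temperature bin), a minimal sketch follows. It is not the authors' analysis code: the function name, the input format (one local temperature in degrees Celsius per fly), and the example values are all assumptions made for illustration.

```python
from collections import Counter
import math


def percent_per_degree_bin(temperatures_c):
    """Return {bin_start_degC: percent of flies} using one-degree bins.

    Hypothetical helper: each input value is the temperature (degC) at the
    position of one fly; a fly at 24.9 degC falls in the 24-25 degC bin.
    """
    if not temperatures_c:
        return {}
    counts = Counter(math.floor(t) for t in temperatures_c)
    total = len(temperatures_c)
    return {b: 100.0 * n / total for b, n in sorted(counts.items())}


# Made-up positions of eight flies, expressed as local temperature in degC.
example = [21.2, 21.8, 22.5, 24.1, 24.9, 25.0, 25.3, 27.6]
distribution = percent_per_degree_bin(example)
```

      The resulting dictionary maps each bin's lower edge to a percentage and can be passed to any plotting tool to draw a histogram or one row of a heatmap.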

    1. eLife assessment

      In this manuscript, Jain and colleagues explore whether increasing adult-born neurons is protective against status epilepticus and the development of spontaneous recurrent seizures (chronic epilepsy) in a mouse pilocarpine model of temporal lobe epilepsy. This is an important work that provides solid data, contradicting previous studies on suppressing chronic seizures by reduction in adult-born neurons.

    2. Reviewer #1 (Public Review):

      Summary:

      As adult-born granule neurons have been shown to play diverse roles, both positive and negative, to modulate hippocampal circuitry and function in epilepsy, understanding the mechanisms by which altered neurogenesis contribute to seizures is important for future therapeutic strategies. The work by Jain et al., demonstrates that increasing adult-born neurons (not increasing adult neurogenesis because BrdU birthdating was not performed in this study) before status epilepticus (SE) leads to a suppression in chronic seizures in the pilocarpine model of temporal lobe epilepsy. This work is potentially interesting because previous studies showed suppressing adult-born neurons led to reduced chronic seizures.

      To increase adult-born neurons, the authors conditionally delete the pro-apoptotic gene Bax using a tamoxifen inducible Nestin-CreERT2 which has been previously published to increase proliferation and survival of adult-born neurons by Sahay et al. (although this was not shown in this study). After 6 weeks of tamoxifen injection, the authors subject male and female mice to pilocarpine induced SE. In the first study, at 2 hours after pilocarpine, the authors examine latency to the first seizure, severity and total number of acute seizures, and power during SE. In the second study in a separate group of mice, the authors examine chronic seizure number and frequency, seizure duration, postictal depression, and seizure distribution/cluster seizures for 3 weeks after pilocarpine. Overall, the study concludes that increasing adult-born neurons in the normal adult brain can reduce epilepsy in females specifically.

      Strengths:

      (1) The study is sex matched and reveals differences in response to increasing adult-born neurons in chronic seizures between males and females.

      (2) The EEG recording parameters are stringent, and analysis of chronic seizures is comprehensive. In two separate experiments, the electrodes were implanted to record EEG from cortex as well as hippocampus. The recording is done for 10 hours post pilocarpine to analyze acute seizures, and for 3 weeks continuous video EEG recording was done to analyze chronic seizures.

      Weaknesses:

      (1) Increased DCX alone (without birthdating with BrdU) could indicate increased survival of adult-born neurons, not proliferation or birth of newborn neurons per se. While prior work has demonstrated that tamoxifen injection in adult mice showed an increase in dentate gyrus neurogenesis based on studies of BrdU, Ki67, and DCX (Sahay et al., 2011), the dynamics of adult-born neurons (proliferation, differentiation, and/or survival) could be different in epileptic (pilocarpine-treated) animals. Other stages, e.g., proliferation of neural precursors or maturation of adult-born dentate granule cells, were not examined. Analysis of additional stages of adult neurogenesis may provide additional cellular understanding and add to the impact of the work on the field.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      As adult-born granule neurons have been shown to play diverse roles, both positive and negative, to modulate hippocampal circuitry and function in epilepsy, understanding the mechanisms by which altered neurogenesis contributes to seizures is important for future therapeutic strategies. The work by Jain et al. demonstrates that increasing adult neurogenesis before status epilepticus (SE) leads to a suppression of chronic seizures in the pilocarpine model of temporal lobe epilepsy. This work is potentially interesting because previous studies showed suppressing neurogenesis led to reduced chronic seizures.

      To increase neurogenesis, the authors conditionally delete the pro-apoptotic gene Bax using a tamoxifen-inducible Nestin-CreERT2 which has been previously published to increase proliferation and survival of adult-born neurons by Sahay et al. After 6 weeks of tamoxifen injection, the authors subjected male and female mice to pilocarpine-induced SE. In the first study, at 2 hours after pilocarpine, the authors examine latency to the first seizure, severity and total number of acute seizures, and power during SE. In the second study in a separate group of mice, at 3 weeks after pilocarpine, the authors examine chronic seizure number and frequency, seizure duration, postictal depression, and seizure distribution/cluster seizures. Overall, the study concludes that increasing adult neurogenesis in the normal adult brain can reduce epilepsy in females specifically. However, important BrdU birthdating experiments in both male and female mice need to be included to support the conclusions made by the authors. Furthermore, speculative mechanisms lacking direct evidence reduce enthusiasm for the findings.

      There are two suggestions. First, BrdU birthdating of newborn neurons is important to add to the paper so that there is support for the conclusions. Second, speculative text reduced enthusiasm. In response, we clarified the conclusions. We do not think that the clarified conclusions require BrdU birthdating (discussed further below). We also removed two schematics (and associated text) that we think the reviewer was referring to when speculation was mentioned.

      We also want to point out something minor: the times listed above are not correct.

      a. Seizures were not measured 2 hrs after pilocarpine; that is when the anticonvulsant diazepam was administered to males. 

      b. Seizures were not measured 3 weeks after pilocarpine; the duration of recording was 3 weeks.  

      (1) BrdU birthdating is required for conclusions.

      We think that the Reviewer was suggesting birthdating because we were not clear about our conclusions, and we apologize for the confusion. The Reviewer stated that we concluded: “conditionally deleting Bax in Nestin-Cre+ cells leads to increased neurogenesis and hilar ectopic granule cells, thereby reducing chronic seizures.”  (Note this is a quote from the review).

      However, we did not intend to conclude that. We intended to conclude that conditionally deleting Bax in Nestin-Cre+ mice reduced chronic seizures in the mouse model of epilepsy that we used. Also, that conclusion only pertained to females. Please note we did not conclude that hilar ectopic granule cells led to reduced seizures. We also concluded that Bax deletion increased neurogenesis in female mice. We have revised the text to make the conclusions clear.

      Abstract, starting on line 67:

      The results suggest that selective Bax deletion to increase adult neurogenesis can reduce experimental epilepsy, and the effect shows a striking sex difference.

      Results, starting on line 448:

      Because Cre+ epileptic females had increased numbers of immature neurons relative to Cre- females at the time of SE, and prior studies show that Cre+ females had less neuronal damage after SE (Jain et al., 2019), female Cre+ mice might have had reduced chronic seizures because of high numbers of immature neurons. However, the data do not prove a causal role.

      Starting on line 477:

      ...we hypothesized that female Cre+ mice would have fewer hilar ectopic GCs than female Cre- mice. However, female Cre+ mice did not have fewer hilar ectopic GCs.

      Discussion, starting on line 563:

      The chronic seizures, measured 4-7 weeks after pilocarpine, were reduced in frequency by about 50% in females. Therefore, increasing young adult-born neurons before the epileptogenic insult can protect against epilepsy. However, we do not know if the protective effect was due to the greater number of new neurons before SE or other effects. Past data would suggest that increased numbers of newborn neurons before SE leads to a reduced SE duration and less neuronal damage in the days after SE. That would be likely to lessen the epilepsy after SE. However, there may have been additional effects of larger numbers of newborn neurons prior to SE.

      Conclusions, starting on line 745:

      In the past, suppressing adult neurogenesis before SE was followed by fewer hilar ectopic GCs and reduced chronic seizures. Here, we show that the opposite - enhancing adult neurogenesis before SE and increased hilar ectopic GCs - do not necessarily reduce seizures. We suggest instead that protection of the hilar neurons from SE-induced excitotoxicity was critical to reducing seizures. The reason for the suggestion is that the survival of hilar neurons would lead to persistence of the normal inhibitory functions of hilar neurons, protecting against seizures. However, this is only a suggestion at the present time because we do not have data to prove it. Additionally, because protection was in females, sex differences are likely to have played an important role. Regardless, the results show that enhancing neurogenesis of young adult-born neurons in Nestin-Cre+ mice had a striking effect in the pilocarpine model, reducing chronic seizures in female mice.

      The Reviewer is correct that it would be interesting to know when the increase in adult neurogenesis that was critical to the effect occurred. For example, was it the initial increase following Bax deletion but before pilocarpine-induced SE, the increase in neurogenesis following SE, or increased adult neurogenesis in the chronic stage of epilepsy? It also might be that related aspects of neurogenesis played a role, such as the degree to which maturation was normal in adult-born neurons. We have not pursued the experiments to identify these aspects of neurogenesis because of how much work it would entail. Also, approaches to establish cause-effect relationships are going to be difficult.

      (2) Speculation.

      We removed the text and supplemental figures with schematics that we think were the overly speculative parts of the paper the Reviewer mentioned.

      Strengths:

      (1) The study is sex-matched and reveals differences in response to increasing adult neurogenesis in chronic seizures between males and females.

      (2) The EEG recording parameters are stringent, and the analysis of chronic seizures is comprehensive. In two separate experiments, the electrodes were implanted to record EEG from the cortex as well as the hippocampus. The recording was done for 10 hours post pilocarpine to analyze acute seizures, and for 3 weeks continuous video EEG recording was done to analyze chronic seizures.

      Weaknesses:

      (1) Cells generated during acute seizures have different properties to cells generated in chronic seizures. In this study, the authors employ two bouts of neurogenesis stimuli (Bax deletion dependent and SE dependent), with two phases of epilepsy (acute and chronic). There are multiple confounding variables to effectively conclude that conditionally deleting Bax in Nestin-Cre+ cells leads to increased neurogenesis and hilar ectopic granule cells, thereby reducing chronic seizures.

      As mentioned above, with a clarification of our conclusions we think we have addressed the concern. We believe that we conditionally deleted Bax in Nestin-expressing cells. We believe we found that female mice had reduced loss of hilar mossy cells and somatostatin-expressing neurons after SE, and fewer chronic seizures after SE. While it makes sense that increased neurogenesis caused the reduced seizures, we acknowledge it was not proved.

      We do not make conclusions about the role of hilar ectopic granule cells. However, we note that they appear to have been similar in number across groups, which suggests they played no role in the results. This is very surprising and therefore adds novelty.

      (2) Related to this is the degree of neurogenesis between Cre+ and Cre- mice and the nature of the sex differences. It is crucial to know the rate/fold change of increased neurogenesis before pilocarpine treatment and whether it is different between male and female mice.

      We agree that sex differences in adult neurogenesis could be shown by a sex difference in rate, fold change, maturation, and other characteristics. However, sex differences can also be shown by a change in doublecortin (DCX), which is what we did. We respectfully submit that an exhaustive study is not critical.

      As a result, we have clarified that DCX was studied either before SE or in the period of chronic seizures:

      Results, starting on line 406:

      III. Before and after epileptogenesis, Cre+ female mice exhibited more immature neurons than Cre- female mice but that was not true for male mice.

      Starting on line 446:

      Therefore, elevated DCX occurred after chronic seizures had developed in Cre+ mice but the effect was limited to females.

      Discussion, starting on line 592:

      This study showed that conditional deletion of Bax from Nestin-expressing progenitors increased young adult-born neurons in the DG when studied 6 weeks after deletion and using DCX as a marker of immature neurons.

      (3) The authors observe more hilar Prox1 cells in Cre+ mice compared to Cre- mice. The authors should confirm the source of the hilar Prox1+ cells.

      This is an excellent question but it is unclear that it is critical to the seizures since both sexes showed more hilar Prox1 cells in Cre+ mice but only the females had fewer seizures than Cre- mice. This is the additional text to describe the results (starting on Line 493):

      In past studies, hilar ectopic GCs have been suggested to promote seizures (Scharfman et al., 2000; Jung et al., 2006; Cho et al., 2015). Therefore, we asked if the numbers of hilar ectopic GCs correlated with the numbers of chronic seizures. When Cre- and Cre+ mice were compared (both sexes pooled), there was a correlation with numbers of chronic seizures (Fig. 6D1) but it suggested that more hilar ectopic GCs improved rather than worsened seizures. However, the correlation was only in Cre- mice, and when sexes were separated there was no correlation (Fig. 6D3).

      When seizure-free interval was examined with sexes pooled, there was a correlation for Cre+ mice (Fig. 6D2) but not Cre- mice. Strangely, the correlations of Cre+ mice with seizure-free interval (Fig. 6D2, D4) suggest ectopic GCs shorten the seizure-free interval and therefore worsen epilepsy, opposite of the correlative data for numbers of chronic seizures. In light of these inconsistent results it seems that hilar ectopic granule cells had no consistent effect on chronic seizures.

      (4) The biggest weakness is the lack of mechanism. The authors postulate a hypothetical mechanism to reconcile how increasing and decreasing adult-born neurons in GCL and hilus and loss of hilar mossy and SOM cells would lead to opposite effects - more or fewer seizures. The authors suggest the reason could be due to rewiring or no rewiring of hilar ectopic GCs, respectively, but do not provide clear-cut evidence.

      As we mention above, we removed the supplemental figures with schematics because they probably were what seemed overly speculative.

      We acknowledge that mechanism is not proven by our study. However, we would like to mention that in our view, showing preservation of hilar mossy cells and SOM cells, but not PV cells, does add mechanistic data to the paper. We understand more experiments are necessary.

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, Jain et al explore whether increasing adult neurogenesis is protective against status epilepticus (SE) and the development of spontaneous recurrent seizures (chronic epilepsy) in a mouse pilocarpine model of TLE. The authors increase adult neurogenesis via conditional deletion of Bax, a pro-apoptotic gene, in Nestin-CreERT2Baxfl/fl mice. Cre- littermates are used as controls for comparisons. In addition to characterizing seizure phenotypes, the authors also compare the abundance of hilar ectopic granule cells, mossy cells, hilar SOM interneurons, and the degree of neuronal damage between mice with increased neurogenesis (Cre+) vs Cre- controls. The authors find less severe SE and a reduction in chronic seizures in female mice with pre-insult increased adult-born neurons. Immunolabeling experiments show these females also have preservation of hilar mossy cells and somatostatin interneurons, suggesting the pre-insult increase in adult neurogenesis is protective.

      Strengths:

      (1) The finding that female mice with increased neurogenesis at the time of pilocarpine exposure have fewer seizures despite having increased hilar ectopic granule cells is very interesting.

      (2) The work builds nicely on the group's prior studies.

      (3) Apparent sex differences are a potentially important finding.

      (4) The immunohistochemistry data are compelling.

      (5) Good controls for EEG electrode implantation effects.

      (6) Nice analysis of most of the SE EEG data.

      Weaknesses:

      (1) In addition to the Cre- littermate controls, a no Tamoxifen treatment group is necessary to control for both insertional effects and leaky expression of the Nestin-CreERT2 transgene.

      About “leaky” expression, we have not found expression to be leaky. We checked by injecting a Cre-dependent virus so that mCherry would be expressed in those cells that had Cre.  The results were published as Supplemental Figure 9 in Jain et al. (2019).

      In the revised manuscript we also mention a study that examined three Nestin-CreERT2 mouse lines (Sun et al., 2014). One of the mouse lines was ours. The leaky expression was not in the mouse line we use. We have added these points to the revised manuscript:

      Methods, section II starting on line 791:

      Although Nestin-Cre-ERT2 mouse lines have been criticized because they can have leaky expression, the mouse line used in the present study did not (Sun et al., 2014), which we confirmed (Jain et al., 2019).

      (2) The authors suggest sex differences; however, experimental procedures differed between male and female mice (as the authors note). Female mice received diazepam 40 minutes after the first pilocarpine-induced seizure onset, whereas male mice did not receive diazepam until 2 hours post-onset. The former would likely lessen the effects of SE on the female mice. Therefore, sex differences cannot be accurately assessed by comparing these two groups, and instead, should be compared between mice with matching diazepam time courses.

      We agree that a shorter delay between pilocarpine and diazepam would be likely to lead to less damage. However, the latency from pilocarpine to SE varied, making the time from the onset of SE to diazepam variable. Most of the variability was in females. By timing the diazepam injection differently in males and females, we could make the time from the onset of SE to diazepam similar between females and males. We have added a supplemental figure to show that our approach led to no significant differences between females and males in the latency to SE, time between SE and diazepam injection, and time between pilocarpine and diazepam injection. We also show that Cre+ females and Cre- females were not different in these times, so the timing could not account for the neuroprotection of Cre+ females.

      Additionally, the authors state that female mice that received diazepam 2 hours post-onset had severe brain damage. This is concerning as it would suggest that SE is more severe in the female than in the male mice.

      We regret that our language was misleading. We intended to say females had more morbidity and mortality than males (lack of appetite and grooming, death in the days after SE) when we gave DZP 2 hrs after Pilo. We actually don’t know why because there were no differences in severity of SE. We think the females had a worse outcome when they had a short latency to SE. These females had a longer period of SE before DZP than males, probably leading to a worse outcome. To correct this we gave DZP to females sooner. Morbidity and mortality then improved in females.

      Interestingly, after we did this we saw females did not always have a short latency to SE. We nevertheless maintained the same regimen to be consistent. As the new supplemental figure (above) shows, there were no significant sex differences in the latency to SE, time between SE and DZP, and time between pilocarpine and DZP.

      (3) Some sample sizes are low, particularly when sex and genotypes are split (n=3-5), which could cause a type II statistical error.

      We agree and have noted this limitation in the Discussion:

      Additional considerations, starting on line 739:

      This study is limited by the possibilities of type II statistical errors in those instances where we divided groups by genotype and sex, leading to comparisons of 3-5 mice/group.

      (4) Several figures show a datapoint in the sex and genotype-separated graphs that is missing from the corresponding male and female pooled graphs (Figs. 2C, 2D, 4B).

      We are very grateful to the Reviewer for pointing out the errors. They are corrected.

      (5) In Suppl Figs. 1B & 1C, subsections 1c and 2c, the EEG trace recording is described as the end of SE; however, SE appears to still be ongoing in these traces in the form of periodic discharges in the EEG.

      The Reviewer is correct.  It is a misconception that SE actually ends completely. The most intense seizure activity may, but what remains is abnormal activity that can last for days. Other investigators observe the same and have suggested that it argues against the concept of a silent period between SE and chronic epilepsy. We had discussed this in our prior papers and had referenced how we define SE.  In the revised manuscript we add the information to the Methods section instead of referencing a prior study:

      Methods, starting on line 899:

      SE duration was defined in light of the fact that the EEG did not return to normal after the initial period of intense activity. Instead, intermittent spiking occurred for at least 24 hrs, as we previously described (Jain et al., 2019) and has been described by others (Mazzuferi et al., 2012; Bumanglag and Sloviter, 2018; Smith et al., 2018). We therefore chose a definition that captured the initial, intense activity. We defined the end of this time as the point when the amplitude of the EEG deflections was reduced to 50% or less of the peak deflections during the initial hour of SE. Specifically, we selected the time after the onset of SE when the EEG amplitude in at least 3 channels had dropped to approximately 2 times the amplitude of the EEG during the first hour of SE, and remained depressed for at least 10 min (Fig. S2 in Jain et al., 2019). Thus, the duration of SE was defined as the time between the onset and this definition of the "end" of SE.
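      As an illustration only, the amplitude-based definition of the "end" of SE quoted above could be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' analysis code: it applies the "50% or less of the peak during the initial hour" criterion, requires it in at least 3 channels, and requires it to hold for at least 10 min. The function names, the idea of a precomputed per-channel amplitude envelope, and the one-value-per-minute sampling are all hypothetical choices made for the example.

```python
from typing import Optional, Sequence


def find_se_end(
    envelopes: Sequence[Sequence[float]],  # one amplitude envelope per EEG channel (assumed precomputed)
    sample_interval_s: float,              # seconds between envelope samples (assumption)
    onset_idx: int,                        # sample index of SE onset
    frac: float = 0.5,                     # "50% or less" of the first-hour peak
    min_channels: int = 3,                 # "at least 3 channels"
    hold_s: float = 600.0,                 # "remained depressed for at least 10 min"
    reference_s: float = 3600.0,           # "initial hour of SE"
) -> Optional[int]:
    """Return the first sample index at which the criterion starts to hold, or None."""
    n = len(envelopes[0])
    ref_end = min(n, onset_idx + int(round(reference_s / sample_interval_s)))
    hold_n = max(1, int(round(hold_s / sample_interval_s)))
    # Peak amplitude of each channel during the reference window after onset.
    peaks = [max(ch[onset_idx:ref_end]) for ch in envelopes]
    run = 0  # number of consecutive samples that satisfy the criterion
    for i in range(onset_idx, n):
        n_low = sum(1 for ch, pk in zip(envelopes, peaks) if ch[i] <= frac * pk)
        if n_low >= min_channels:
            run += 1
            if run >= hold_n:
                return i - hold_n + 1  # start of the sustained low-amplitude run
        else:
            run = 0
    return None


def se_duration_s(end_idx: int, onset_idx: int, sample_interval_s: float) -> float:
    """Duration of SE: time between onset and the detected "end"."""
    return (end_idx - onset_idx) * sample_interval_s


# Synthetic one-value-per-minute envelopes (made-up numbers, not data from the study):
# 90 min of intense activity, a 5-min dip that is too short to count,
# 10 more min of intense activity, then a sustained drop.
strong = [10.0] * 90 + [4.0] * 5 + [10.0] * 10 + [3.0] * 30
always_high = [10.0] * len(strong)
channels = [strong, strong, strong, always_high]
end = find_se_end(channels, sample_interval_s=60.0, onset_idx=0)
print(end)  # prints 105
```

      In this toy example the brief dip is ignored because it lasts less than 10 min, and the detected "end" falls where the sustained drop begins, giving an SE duration of 105 min. The search starts at onset for simplicity; starting it after the reference hour would be an equally reasonable reading of the text.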

      (6) In Results section II.D and associated Fig.3, what the authors refer to as "postictal EEG depression" is more appropriately termed "postictal EEG suppression". Also, postictal EEG suppression has established criteria to define it that should be used.

      We find that suppression is typical in studies of ECT or humans (Esmaeili et al., 2023; Gascoigne et al., 2023; Hahn et al., 2023; Kavakbasi et al., 2023; Langroudi et al., 2023; Karl et al., 2024; Vilan et al., 2024; Zhao et al., 2024), whereas animal research uses the term postictal depression (Kanner et al., 2010; Krishnan and Bazhenov, 2011; Riljak et al., 2012; Singh et al., 2012; Carballosa-Gonzalez et al., 2013; Kommajosyula et al., 2016; Smith et al., 2018; Uva and de Curtis, 2020; Medvedeva et al., 2023). Therefore we think depression is a more suitable term.

      The example traces in Fig. 3A and B should also be expanded to better show this potential phenomenon.

      We expanded traces in Fig. 3 as suggested. They are in Fig 3A.

      (7) In Fig.5D, the area fraction of DCX in Cre+ female mice is comparable to that of Cre- and Cre+ male mice. Is it possible that there is a ceiling effect in DCX expression that may explain why male Cre+ mice do not have a significant increase compared to male Cre- mice?

      We thank the Reviewer for the intriguing possibility. We now mention it in the manuscript:

      Results, starting on line 456:

      It is notable that the Cre+ male mice did not show increased numbers of immature neurons at the time of chronic seizures but Cre+ females did. It is possible that there was a “ceiling” effect in DCX expression that would explain why male Cre+ mice did not have a significant increase in immature neurons relative to male Cre- mice.

      (8) In Suppl. Fig 6, the authors should include DCX immunolabeling quantification from conditional Cre+ male mice used in this study, rather than showing data from a previous publication.

      We have made this revision.

      (9) In Fig 8, please also include Fluorojade-C staining and quantification for male mice.

      The additional data for males have been added to part D.

      (10) Page 13: Please specify in the first paragraph of the discussion that findings were specific to female mice with pre-insult increases in adult-born neurogenesis.

      This has been done.

      Minor:

      (11) In Fig. 1 and suppl. figure 1, please clarify whether traces are from male or female mice.

      We have clarified.

      (12) Please be consistent with indicating whether immunolabeling images are from female or male mice.

      a. Fig 5B images labeled as from "Cre- Females" and "Cre+ Females".

      b. Suppl. Fig 8: Images labeled as "Cre- F" and "Cre+ F".

      c. Fig 6: sex not specified.

      d. Fig. 7: sex only specified in the figure legend.

      e. Fig 8: only female mice were included in these experiments, but this is not clear from the figure title or legend.

      We revised all figures according to the comments.

      (13) Page 4: the last paragraph of the introduction belongs within the discussion section.

      We recognize there is a classic view that any discussion of Results should not be in the Introduction. However, we find that view has faded and more authors make a brief summary statement about the Results at the end of the Introduction. We would like to do so because it allows readers to understand the direction of the study at the outset, which we find helpful.

      (14) Page 6: The sentence "The data are consistent with prior studies..." is unnecessary.

      We have removed the text.

      (15) Suppl. Fig 6A: Please include representative images of normal condition DCX immunolabeling.

      We have added these data. There is an image of a Cre- female, Cre+ female, Cre- male and Cre+ male in the new figure, Supplemental Figure 6. All mice had tamoxifen at 6 weeks of age and were perfused 6 weeks later. None of the mice had pilocarpine.

      (16) In Suppl. Fig 7C, I believe the authors mean "no loss of hilar mossy and SOM cells" instead of "loss of hilar mossy and SOM cells".

      This Figure was removed because of the input from Reviewer 1 suggesting it was too speculative.

      Reviewer #1 (Recommendations For The Authors):

      (1) The main claim of the study is that increasing adult neurogenesis decreases chronic seizures. However, to quantify adult-born neurons, DCX immunoreactivity is used as the sole metric to determine neurogenesis. This is insufficient as changes in DCX-expressing cells could also be an indicator of altered maturation, survival, and/or migration, not proliferation per se. To claim that increasing adult neurogenesis is associated with a reduction of chronic seizures, the authors should perform a pulse/chase (birth dating) experiment with BrdU and co-labeling with DCX.

      We think that increased DCX does reflect increased adult neurogenesis. However, we agree that one does not know if it was due to increased proliferation, survival, etc. We also note that this mouse line has been studied thoroughly to show there was increased neurogenesis with BrdU, Ki67 and DCX. We mention that paper in the revised text:

      Methods, starting on line 786:

      It was shown that after tamoxifen injection in adult mice there is an increase in dentate gyrus neurogenesis based on studies of bromo-deoxyuridine, Ki67, and doublecortin (Sahay et al., 2011).

      (2) As mentioned above, analysis of DCX staining alone months after TAM injections is limited. Instead, the cells could be labelled by BrdU prior to TAM injection, following which quantification of BrdU+/Prox1+ cells at 6 weeks post TAM injection should be performed in Cre+ and Cre- mice (males and females) to yield the rate of neurogenesis increase.

      We respectfully disagree that birthdating cells is critical. Using DCX staining just before SE, we know the size of the population of cells that are immature at the time of SE. This is what we think is most important because these immature neurons are those that appear to affect SE, as we have already shown.

      (3) To confirm the source of the hilar Prox1+ cells, a dual BrdU/EdU labeling approach would be beneficial. BrdU injection could be given before TAM injection and EdU injection before pilocarpine to label different cohorts of neural stem cells. Co-staining with Prox1 at different time points will help in identifying the origin of hilar ectopic cells.

      We are grateful for the ideas of the Reviewer. We hesitate to do these experiments now because it seems like a new study to find out where hilar granule cells come from.

      REFERENCES

      Bumanglag AV, Sloviter RS (2018) No latency to dentate granule cell epileptogenesis in experimental temporal lobe epilepsy with hippocampal sclerosis. Epilepsia 59:2019-2034.

      Carballosa-Gonzalez MM, Munoz LJ, Lopez-Alburquerque T, Pardal-Fernandez JM, Nava E, de Cabo C, Sancho C, Lopez DE (2013) EEG characterization of audiogenic seizures in the hamster strain gash:Sal. Epilepsy Res 106:318-325.

      Cho KO, Lybrand ZR, Ito N, Brulet R, Tafacory F, Zhang L, Good L, Ure K, Kernie SG, Birnbaum SG, Scharfman HE, Eisch AJ, Hsieh J (2015) Aberrant hippocampal neurogenesis contributes to epilepsy and associated cognitive decline. Nat Commun 6:6606.

      Esmaeili B, Weisholtz D, Tobochnik S, Dworetzky B, Friedman D, Kaffashi F, Cash S, Cha B, Laze J, Reich D, Farooque P, Gholipour T, Singleton M, Loparo K, Koubeissi M, Devinsky O, Lee JW (2023) Association between postictal EEG suppression, postictal autonomic dysfunction, and sudden unexpected death in epilepsy: Evidence from intracranial EEG. Clin Neurophysiol 146:109-117.

      Gascoigne SJ, Waldmann L, Schroeder GM, Panagiotopoulou M, Blickwedel J, Chowdhury F, Cronie A, Diehl B, Duncan JS, Falconer J, Faulder R, Guan Y, Leach V, Livingstone S, Papasavvas C, Thomas RH, Wilson K, Taylor PN, Wang Y (2023) A library of quantitative markers of seizure severity. Epilepsia 64:1074-1086.

      Hahn T et al. (2023) Towards a network control theory of electroconvulsive therapy response. PNAS Nexus 2:pgad032.

      Jain S, LaFrancois JJ, Botterill JJ, Alcantara-Gonzalez D, Scharfman HE (2019) Adult neurogenesis in the mouse dentate gyrus protects the hippocampus from neuronal injury following severe seizures. Hippocampus 29:683-709.

      Jung KH, Chu K, Lee ST, Kim J, Sinn DI, Kim JM, Park DK, Lee JJ, Kim SU, Kim M, Lee SK, Roh JK (2006) Cyclooxygenase-2 inhibitor, celecoxib, inhibits the altered hippocampal neurogenesis with attenuation of spontaneous recurrent seizures following pilocarpine-induced status epilepticus. Neurobiol Dis 23:237-246.

      Kanner AM, Trimble M, Schmitz B (2010) Postictal affective episodes. Epilepsy Behav 19:156-158.

      Karl S, Sartorius A, Aksay SS (2024) No effect of serum electrolyte levels on electroconvulsive therapy seizure quality parameters. J ECT 40:47-50.

      Kavakbasi E, Stoelck A, Wagner NM, Baune BT (2023) Differences in cognitive adverse effects and seizure parameters between thiopental and propofol anesthesia for electroconvulsive therapy. J ECT 39:97-101.

      Kommajosyula SP, Randall ME, Tupal S, Faingold CL (2016) Alcohol withdrawal in epileptic rats - effects on postictal depression, respiration, and death. Epilepsy Behav 64:9-14.

      Krishnan GP, Bazhenov M (2011) Ionic dynamics mediate spontaneous termination of seizures and postictal depression state. J Neurosci 31:8870-8882.

      Langroudi ME, Shams-Alizadeh N, Maroufi A, Rahmani K, Rahchamani M (2023) Association between postictal suppression and the therapeutic effects of electroconvulsive therapy: A systematic review. Asia Pac Psychiatry 15:e12544.

      Mazzuferi M, Kumar G, Rospo C, Kaminski RM (2012) Rapid epileptogenesis in the mouse pilocarpine model: Video-EEG, pharmacokinetic and histopathological characterization. Exp Neurol 238:156-167.

      Medvedeva TM, Sysoeva MV, Sysoev IV, Vinogradova LV (2023) Intracortical functional connectivity dynamics induced by reflex seizures. Exp Neurol 368:114480.

      Riljak V, Maresova D, Jandova K, Bortelova J, Pokorny J (2012) Impact of chronic ethanol intake of rat mothers on the seizure susceptibility of their immature male offspring. Gen Physiol Biophys 31:173-177.

      Sahay A, Scobie KN, Hill AS, O'Carroll CM, Kheirbek MA, Burghardt NS, Fenton AA, Dranovsky A, Hen R (2011) Increasing adult hippocampal neurogenesis is sufficient to improve pattern separation. Nature 472:466-470.

      Scharfman HE, Goodman JH, Sollas AL (2000) Granule-like neurons at the hilar/CA3 border after status epilepticus and their synchrony with area CA3 pyramidal cells: Functional implications of seizure-induced neurogenesis. J Neurosci 20:6144-6158.

      Singh B, Singh D, Goel RK (2012) Dual protective effect of passiflora incarnata in epilepsy and associated post-ictal depression. J Ethnopharmacol 139:273-279.

      Smith ZZ, Benison AM, Bercum FM, Dudek FE, Barth DS (2018) Progression of convulsive and nonconvulsive seizures during epileptogenesis after pilocarpine-induced status epilepticus. J Neurophysiol 119:1818-1835.

      Sun MY, Yetman MJ, Lee TC, Chen Y, Jankowsky JL (2014) Specificity and efficiency of reporter expression in adult neural progenitors vary substantially among nestin-creer(t2) lines. J Comp Neurol 522:1191-1208.

      Uva L, de Curtis M (2020) Activity- and ph-dependent adenosine shifts at the end of a focal seizure in the entorhinal cortex. Epilepsy Res 165:106401.

      Vilan A, Grangeia A, Ribeiro JM, Cilio MR, de Vries LS (2024) Distinctive amplitude-integrated EEG ictal pattern and targeted therapy with carbamazepine in kcnq2 and kcnq3 neonatal epilepsy: A case series. Neuropediatrics 55:32-41.

      Zhao C, Tang Y, Xiao Y, Jiang P, Zhang Z, Gong Q, Zhou D (2024) Asymmetrical cortical surface area decrease in epilepsy patients with postictal generalized electroencephalography suppression. Cereb Cortex 34.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public Review):

      Comment 1: One of the only demonstrations of the expression and physiological significance of TRPCs in VTA DA neurons was published by (Rasmus et al., 2011; Klipec et al., 2016) which are not cited in this paper. In their study, TRPC4 expression was detected in a uniformly distributed subset of VTA DA neurons, and TRPC4 KO rats showed decreased VTA DA neuron tonic firing and deficits in cocaine reward and social behaviors. Update: The authors say they have added a discussion of these papers, but I do not see it in the updated manuscript.

      We thank the reviewer for the suggestion. The discussion for this has been added (line 557-565).

      Comment 2: The authors should report the results (exact data values) of female mice in the results text, or pool the male and female data if the sex differences are not significant.

      We agree with the reviewer. Some experiments were repeated in female mice, and the data from male and female mice are now reported in the Results text.

      Comment 3: The selectivity of drugs should be referred as "selective" rather than "specific". 

      Thanks, “specific” has been changed to “selective”.  

      Comment 4: Line 62: typo, "substantia nigra". 

      Thanks, “substantial nigra” has been changed to “substantia nigra” in line 65.  

      Comment 5: Line 77: some new studies suggest that NALCN might have voltage dependency (rectification).

      Thanks, description of NALCN voltage dependence has been corrected in line 81-83.

      Comment 6: Line 175: change "less" to "fewer". 

      Thanks, “less” has been changed to “fewer”.

      Comment 7: Line 299: choose one - "was not ... or" or "was neither ... nor". 

      Thanks, this error has been corrected. 

      Comment 8: In Figure 1Aii and Figure 3Bi, it was not specified in the results text or figure legend that C1-C5 represent individual cell until the legend for Figure 4.

      Thanks, the description of the gel has been added to the figure legends.

      Reviewer #2 (Public Review): 

      Comment 1: From the previous review, we mentioned that " 'The HCN' as written in line 69 is a bit misleading, as HCN channels in the heart and brain are different members of a family of channels, although as written in the text, it seems that they are identical." This is still the case (now line 73).

      We agree with the reviewer’s comment. The introduction of HCN channels has been corrected (line 74-78).

      Comment 2: The authors state in line 112 that "most of the experiments were also repeated in female mice" - this is true in the case of most electrophysiological experiments, although not behavioral experiments. Authors should amend the statement in line 112 and clarify in the Discussion section which findings are generalizable between sexes; e.g.:

      a.  Discussion of HCN contribution to VTA DA activity (beginning line 453) should clarify male mice. 

      b.  Similarly, any discussion of behavioral findings should clarify male mice. 

      We agree with the reviewer’s comments. The sexes of the mice used have been noted in the Results and Discussion.

      Comment 3: The authors' statement in lines 179-183 ("In contrast, fewer GABAergic neuronal markers (Glutamic acid decarboxylase, GAD1/2 and vesicular GABA transporter, VGAT) co-expressed with the DA neurons, which is consistent with previous studies that VTA DA neurons co-expressing GABAergic neuronal markers mainly project to the lateral habenula") is a little confusing - as stated, it seems that the authors are confirming DA/GABA coexpression in VTA-LHb neurons, which is not the case.

      We agree with the reviewer’s comment. We corrected this statement (line 182-186).

      Comment 4: Additional information could be included in the Methods section description of Western Blotting procedures - e.g., what thickness of tissue and what size gauge were used to dissect VTA for these experiments?

      Thanks. The description of tissue in Western Blotting procedures has been added.

      Comment 5:

      a. Grammatical errors in line 23 of Abstract (also lines 31-32)

      b. "drove" should read "strove" in line 92 

      c. Grammatical errors in lines 401, 444, and 448 

      We thank the reviewer for pointing out grammatical errors and we corrected them.

      Reviewer #3 (Public Review): 

      Comment 1: The main strength of this study lies in a comprehensive bottom-up approach ranging from patch-clamp recordings to behavioral tasks. These tasks mainly address anxiety-like behaviors and so-called depression-like behaviors (sucrose choice, forced swim test, tail suspension test). The results gathered by means of these procedures are clear-cut. However, the reviewer believes that the authors should be more cautious when interpreting immobility responses to stress (forced swim, tail suspension) as "depression-like" responses. These stress models have been routinely used (and validated) in the past to detect the antidepressant properties of compounds under investigation, which by no means indicates that these are depression models. For readers interested in this debate, I suggest reading e.g. De Kloet and Molendijk (Biol. Psychiatry 2021).

      We thank the reviewer for the suggestion. We will be more careful and rigorous in the selection of stress models in our subsequent research work.

      Editor's note:

      Should you choose to revise your manuscript, please include full statistical reporting including exact p-values wherever possible alongside the summary statistics (test statistic and df) and 95% confidence intervals. These should be reported for all key questions and not only when the p-value is less than 0.05.

      We have added the full statistical reporting including exact p-values wherever possible alongside the summary statistics (test statistic and df) and 95% confidence intervals into the results and the figure legends of the revised manuscript.

    2. eLife assessment

      This important study examined the mechanisms underlying reduced excitability of ventral tegmental area dopamine neurons in mice that underwent a chronic mild unpredictable stress treatment. The authors identify NALCN and TRPC6 channels as key mechanisms that regulate spontaneous firing of ventral tegmental area dopamine neurons and examined their roles in reduced firing in mice that underwent a chronic mild unpredictable stress treatment. The authors' conclusions on neurophysiological data are supported by multiple approaches and are convincing, although the relevance of the behavioral results to human depression remains unclear.

    3. Reviewer #1 (Public Review):

      Wang et al. present a paper aiming to identify NALCN and TRPC6 channels as key mechanisms regulating VTA dopaminergic neuron spontaneous firing and investigating whether these mechanisms are disrupted in a mouse model of chronic unpredictable stress.

      Major strengths:

      This paper uses multiple approaches to investigate the role of NALCN and TRPC6 channels in VTA dopaminergic neurons.

    4. Reviewer #2 (Public Review):

      This paper describes the results of a set of complementary and convergent experiments aimed at describing roles for the non-selective cation channels NALCN and TRPC6 in mediating subthreshold inward depolarizing currents and action potential generation in VTA DA neurons under normal physiological conditions. In general, the authors have responded satisfactorily to reviewer comments, and the revised manuscript is improved.

    5. Reviewer #3 (Public Review):

      The authors of this study have examined which cation channels specifically confer on ventral tegmental area dopaminergic neurones their autonomous (spontaneous) firing properties. Having brought evidence for the key role played by NALCN and TRPC6 channels therein, the authors aimed at measuring whether these channels play some role in so-called depression-like (but see below) behaviors triggered by chronic exposure to different stressors. Following evidence for a down-regulation of TRPC6 protein expression in ventral tegmental area dopaminergic cells of stressed animals, the authors provide evidence through viral expression protocols for a causal link between such a down-regulation and so-called depression-like behaviors. The main strength of this study lies in a comprehensive bottom-up approach ranging from patch-clamp recordings to behavioral tasks. These tasks mainly address anxiety-like behaviors and so-called depression-like behaviors (sucrose choice, forced swim test, tail suspension test). The results gathered by means of these procedures are clear-cut.

    1. Reviewer #1 (Public Review):

      Summary:

      This paper reports an intracranial SEEG study of speech coordination, where participants synchronize their speech output with a virtual partner that is designed to vary its synchronization behavior. This allows the authors to identify electrodes throughout the left hemisphere of the brain that have activity (both power and phase) that correlates with the degree of synchronization behavior. They find that high-frequency activity in the secondary auditory cortex (superior temporal gyrus) is correlated to synchronization, in contrast to primary auditory regions. Furthermore, activity in the inferior frontal gyrus shows a significant phase-amplitude coupling relationship that is interpreted as compensation for deviation from synchronized behavior with the virtual partner.

      Strengths:

      (1) The development of a virtual partner model trained for each individual participant, which can dynamically vary its synchronization to the participant's behavior in real-time, is novel and exciting.

      (2) Understanding real-time temporal coordination for behaviors like speech is a critical and understudied area.

      (3) The use of SEEG provides the spatial and temporal resolution necessary to address the complex dynamics associated with the behavior.

      (4) The paper provides some results that suggest a role for regions like IFG and STG in the dynamic temporal coordination of behavior both within an individual speaker and across speakers performing a coordination task.

      Weaknesses:

      (1) The main weakness of the paper is that the results are presented in a largely descriptive and vague manner. For instance, while the interpretation of predictive coding and error correction is interesting, it is not clear how the experimental design or analyses specifically support such a model, or how they differentiate that model from the alternatives. It's possible that some greater specificity could be achieved by a more detailed examination of this rich dataset, for example by characterizing the specific phase relationships (e.g., positive vs negative lags) in areas that show correlations with synchronization behavior. However, as written, it is difficult to understand what these results tell us about how coordination behavior arises.

      (2) In the results section, there's a general lack of quantification. While some of the statistics reported in the figures are helpful, there are also claims that are stated without any statistical test. For example, in the paragraph starting on line 342, it is claimed that there is an inverse relationship between rho-value and frequency band, "possibly due to the reversed desynchronization/synchronization process in low and high frequency bands". Based on Figure 3, the first part of this statement appears to be true qualitatively, but is not quantified, and is therefore impossible to assess in relation to the second part of the claim. Similarly, the next paragraph on line 348 describes optimal clustering, but statistics of the clustering algorithm and silhouette metric are not provided. More importantly, it's not entirely clear what is being clustered - is the point to identify activity patterns that are similar within/across brain regions? Or to interpret the meaning of the specific patterns? If the latter, this is not explained or explored in the paper.

      (3) Given the design of the stimuli, it would be useful to know more about how coordination relates to specific speech units. The authors focus on the syllabic level, which is understandable. But as far as the results relate to speech planning (an explicit point in the paper), the claims could be strengthened by determining whether the coordination signal (whether error correction or otherwise) is specifically timed to e.g., the consonant vs the vowel. If the mechanism is a phase reset, does it tend to occur on one part of the syllable?

      (4) In the discussion the results are related to a previously-described speech-induced suppression effect. However, it's not clear what the current results have to do with SIS, since the speaker's own voice is present and predictable from the forward model on every trial. Statements such as "Moreover, when the two speech signals come close enough in time, the patient possibly perceives them as its own voice" are highly speculative and apparently not supported by the data.

      (5) There are some seemingly arbitrary decisions made in the design and analysis that, while likely justified, need to be explained. For example, how were the cutoffs for moderate coupling vs phase-shifted coupling (k ~0.09) determined? This is noted as "rather weak" (line 212), but it's not clear where this comes from. Similarly, the ROI-based analyses are only done on regions "recorded in at least 7 patients" - how was this number chosen? How many electrodes total does this correspond to? Is there heterogeneity within each ROI?

    2. Reviewer #2 (Public Review):

      Summary:

      This paper investigates the neural underpinnings of an interactive speech task requiring verbal coordination with another speaker. To achieve this, the authors recorded intracranial brain activity from the left hemisphere in a group of drug-resistant epilepsy patients while they synchronised their speech with a 'virtual partner'. Crucially, the authors were able to manipulate the degree of success of this synchronisation by programming the virtual partner to either actively synchronise or desynchronise their speech with the participant, or else to not vary its speech in response to the participant (making the synchronisation task purely one-way). Using such a paradigm, the authors identified different brain regions that were either more sensitive to the speech of the virtual partner (primary auditory cortex), or more sensitive to the degree of verbal coordination (i.e. synchronisation success) with the virtual partner (secondary auditory cortex and IFG). Such sensitivity was measured by (1) calculating the correlation between the index of verbal coordination and mean power within a range of frequency bands across trials, and (2) calculating the phase-amplitude coupling between the behavioural and brain signals within single trials (using the power of high-frequency neural activity only). Overall, the findings help to elucidate some of the left hemisphere brain areas involved in interactive speaking behaviours, particularly highlighting the high-frequency activity of the IFG as a potential candidate supporting verbal coordination.

      Strengths:

      This study provides the field with a convincing demonstration of how to investigate speaking behaviours in more complex situations that share many features with real-world speaking contexts e.g. simultaneous engagement of speech perception and production processes, the presence of an interlocutor, and the need for inter-speaker coordination. The findings thus go beyond previous work that has typically studied solo speech production in isolation, and represent a significant advance in our understanding of speech as a social and communicative behaviour. It is further an impressive feat to develop a paradigm in which the degree of cooperativity of the synchronisation partner can be so tightly controlled; in this way, this study combines the benefits of using pre-recorded stimuli (namely, the high degree of experimental control) with the benefits of using a live synchronisation partner (allowing the task to be truly two-way interactive, an important criticism of other work using pre-recorded stimuli). A further key strength of the study lies in its employment of stereotactic EEG to measure brain responses with both high temporal and spatial resolution, an ideal method for studying the unfolding relationship between neural processing and this dynamic coordination behaviour.

      Weaknesses:

      One major limitation of the current study is the lack of coverage of the right hemisphere by the implanted electrodes. Of course, electrode location is solely clinically motivated, and so the authors did not have control over this. However, this means that the current study neglects the potentially important role of the right hemisphere in this task. The right hemisphere has previously been proposed to support feedback control for speech (likely a core process engaged by synchronous speech), as opposed to the left hemisphere, which has been argued to underlie feedforward control (Tourville & Guenther, 2011). Indeed, a previous fMRI study of synchronous speech reported the engagement of a network of right hemisphere regions, including STG, IPL, IFG, and the temporal pole (Jasmin et al., 2016). Further, the release from speech-induced suppression during synchronous speech reported by Jasmin et al. was found in the right temporal pole, which may explain the discrepancy with the current finding of reduced leftward high-frequency activity with increasing verbal coordination (suggesting instead increased speech-induced suppression for successful synchronisation). The findings should therefore be interpreted with the caveat that they are limited to the left hemisphere, and are thus likely missing an important aspect of the neural processing underpinning verbal coordination behaviour.

      A further limitation of this study is that its findings are purely correlational in nature; that is, the results tell us how neural activity correlates with behaviour, but not whether it is instrumental in that behaviour. Elucidating the latter would require some form of intervention, such as electrode stimulation, to disrupt activity in a brain area and measure the resulting effect on behaviour. Any claims as to the specific role of brain areas in verbal coordination (e.g. the role of the IFG in supporting online coordinative adjustments to achieve synchronisation) are therefore speculative.

    1. Reviewer #1 (Public Review):

      Summary:

      Johnston and Smith used linear electrode arrays to record from small populations of neurons in the superior colliculus (SC) of monkeys performing a memory-guided saccade (MGS) task. Dimensionality reduction (PCA) was used to reveal low-dimensional subspaces of population activity reflecting the slow drift of neuronal signals during the delay period across a recording session (similar to what they reported for parts of the cortex: Cowley et al., 2020). This SC drift was correlated with a similar slow-drift subspace recorded from the prefrontal cortex, and both slow-drift subspaces tended to be associated with changes in arousal (pupil size). These relationships were driven primarily by neurons in superficial layers of the SC, where saccade sensitivity/selectivity is typically reduced. Accordingly, delay-period modulations of both spiking activity and pupil size were independent of saccade-related activity, which was most prevalent in deeper layers of the SC. The authors suggest that these findings provide evidence of a separation of arousal- and motor-related signals. The analysis techniques expand upon the group's previous work and provide useful insight into the power of large-scale neural recordings paired with dimensionality reduction. This is particularly important with the advent of recording technologies which allow for the measurement of spiking activity across hundreds of neurons simultaneously. Together, these results provide a useful framework for comparing how different populations encode signals related to cognition, arousal, and motor output in potentially different subspaces.

      The conclusions drawn by this paper, however, are only partially supported by the data. Additional statistical comparisons and clarifications are needed.

      Comments:

      (1) The authors make fairly strong claims that "arousal-related fluctuations are isolated from neurons in the deep layers of the SC" (emphasis added). This conclusion is based on comparisons between a "slow drift axis", a low-dimensional representation of neuronal drift, and other measures of arousal (Figures 2C, 3) and motor output sensitivity (Figures 2B, 3B). However, the metrics used to compare the slow-drift axis and motor activity were computed during separate task epochs: the delay period (600-1100 ms) and a peri-saccade epoch (25 ms before and after saccade initiation), respectively. As the authors reference, deep-layer SC neurons are typically active only around the time of a saccade. Therefore, it is not clear if the lack of arousal-related modulations reported for deep-layer SC neurons is because those neurons are truly insensitive to those modulations, or if the modulations were not apparent because they were assessed in an epoch in which the neurons were not active. A potentially more valuable comparison would be to calculate a slow-drift axis aligned to saccade onset.

      (2) More generally, arousal-related signals may persist throughout multiple different epochs of the task. It would be worthwhile to determine whether similar "slow-drift" dynamics are observed for baseline, sensory-evoked, and saccade-related activity. Although it may not be possible to examine pupil responses during a saccade, there may be systematic relationships between baseline and evoked responses.

      (3) The relationships between changes in SC activity and pupil size are quite small (Figures 2C & 5C). Although the distribution across sessions (Figure 2C) is greater than chance, the values are nearly one quarter of the size of those in the PFC-SC axis comparisons. Likewise, the r² values relating pupil size and spiking activity directly (Figure 5) are quite low. We remain skeptical that these drifts are truly due to arousal and cannot be accounted for by other factors. For example, does the relationship persist if accounting for a very simple, monotonic (e.g., linear) drift in pupil size and overall firing rate over the course of an individual session?

      (4) It is not clear how the final analysis (Figure 6) contributes to the authors' conclusions. The authors perform PCA on: (i) residual spiking responses during the delay period binned according to pupil size, and (ii) spiking responses in the saccade epoch binned according to target location (i.e., the saccade tuning curve). The corresponding PCs are the spike-pupil axis and the saccade tuning axis, respectively. Unsurprisingly, the spike-pupil axis that captures variance associated with arousal (and removes variance associated with saccade direction) was not correlated with a saccade-tuning axis that captures variance associated with saccade direction and omits arousal. Had these measures been related, it would imply a unique association between a neuron's preferred saccade direction and pupil control, which seems unlikely. The separation of these axes thus seems trivial and does not provide evidence of a "mechanism...in the SC to prevent arousal-related signals interfering with the motor output." It remains unknown whether, for example, arousal-related signals may impact trial-by-trial changes in neuronal gain near the time of a saccade, or alter saccade dynamics such as acceleration, precision, and reaction time.
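
      The control suggested in comment (3) could be sketched roughly as follows. This is only an illustration on synthetic data, not the authors' analysis: the function name `detrended_correlation`, the per-trial variables `pupil` and `rate`, and the drift slopes are all assumptions made for the example. It removes a straight-line trend over trial index from both series and then correlates the residuals.

```python
import numpy as np

def detrended_correlation(x, y):
    """Pearson r between x and y after subtracting a linear trend
    (slope + intercept over trial index) from each series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    t = np.arange(x.size)
    # Fit a first-order polynomial to each series and keep the residuals.
    x_res = x - np.polyval(np.polyfit(t, x, 1), t)
    y_res = y - np.polyval(np.polyfit(t, y, 1), t)
    return np.corrcoef(x_res, y_res)[0, 1]

# Synthetic session (hypothetical values): both signals drift linearly
# across trials but are otherwise independent noise.
rng = np.random.default_rng(0)
n_trials = 500
t = np.arange(n_trials)
pupil = 0.01 * t + rng.normal(size=n_trials)
rate = 0.02 * t + rng.normal(size=n_trials)

raw_r = np.corrcoef(pupil, rate)[0, 1]           # inflated by the shared drift
detrended_r = detrended_correlation(pupil, rate)  # near zero once drift is removed
print(f"raw r = {raw_r:.2f}, detrended r = {detrended_r:.2f}")
```

      If a pupil-spiking relationship survives this kind of detrending in the real data, a shared monotonic drift alone would be a less likely explanation; if it vanishes, the arousal interpretation would need further support.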

    2. Reviewer #2 (Public Review):

      Summary:

      Neurons in motor-related areas have increasingly been shown to also carry other, non-motoric signals. This creates a problem of avoidance of interference between the motor and non-motor-related signals. This is a significant problem that likely affects many brain areas. The specific example studied here is interference between saccade-related activity and slow-changing arousal signals in the superior colliculus. The authors identify neuronal activity related to saccades and arousal. Identifying saccade-related activity is straightforward, but arousal-related activity is harder to identify. The authors first identify a potential neuronal correlate of arousal using PCA to identify a component in the population activity corresponding to slow drift over the recording session. Next, they link this component to arousal by showing that the component is present across different brain areas (SC and PFC), and that it is correlated with pupil size, an external marker of arousal. Having identified an arousal-related component in SC, the authors show next that SC neurons with strong motor-related activity are less strongly affected by this arousal component (both SC and PFC). Lastly, they show that SC population activity patterns related to saccades and pupil size form orthogonal subspaces in the SC population.

      Strengths:

      A great strength of this research is the clear description of the problem, its relationship with the performed analysis, and the interpretation of the results. The paper is very well written and easy to follow.

      An additional strength is the use of fairly sophisticated analysis using population activity.

      Weaknesses:

      (1) The greatest weakness in the present research is the fact that arousal is a functionally less important non-motoric variable. The authors themselves introduce the problem with a discussion of attention, which is without any doubt the most important cognitive process that needs to be functionally isolated from oculomotor processes. Given this introduction, one cannot help but wonder, why the authors did not design an experiment, in which spatial attention and oculomotor control are differentiated. Absent such an experiment, the authors should spend more time explaining the importance of arousal and how it could interfere with oculomotor behavior.

      (2) In this context, it is particularly puzzling that one actually would expect effects of arousal on oculomotor behavior. Specifically, saccade reaction time, accuracy, and speed could be influenced by arousal. The authors should include an analysis of such effects. They should also discuss the absence or presence of such effects and how they affect their other results.

      (3) The authors use the analysis shown in Figure 6D to argue that across recording sessions the activity components capturing variance in pupil size and saccade tuning are uncorrelated. However, the distribution (green) seems to be non-uniform, with peaks at very low and very high correlations specifically. The authors should test if such an interpretation is correct. If yes, where are the low and high correlations respectively? Are there potentially two functional areas in SC?

    3. Reviewer #3 (Public Review):

      Summary:

      This study looked at slow changes in neuronal activity (on the order of minutes to hours) in the superior colliculus (SC) and prefrontal cortex (PFC) of two monkeys. They found that SC activity shows slow drift in neuronal activity like in the cortex. They then computed a motor index in SC neurons. By definition, this index is low if the neuron has stronger visual responses than motor responses, and it is high if the neuron has weaker visual responses and stronger motor responses. The authors found that the slow drift in neuronal activity was more prevalent in the low motor index SC neurons and less prevalent in the high motor index neurons. In addition, the authors measured pupil diameter and found it to correlate with slow drifts in neuronal activity, but only in the neurons with lower motor index of the SC. They concluded that arousal signals affecting slow drifts in neuronal modulations are brain-wide. They also concluded that these signals are not present in the deepest SC layers, and they interpreted this to mean that this minimizes the impact of arousal on unwanted eye movements.

      Strengths:

      The paper is clear and well-written.

      Showing slow drifts in the SC activity is important to demonstrate that cortical slow drifts could be brain-wide.

      Weaknesses:

      However, I am concerned about two main points:

      First, the authors repeatedly say that the "output" layers of the SC are the ones with the highest motor indices. This might not necessarily be accurate. For example, current thresholds for evoking saccades are lowest in the intermediate layers, and Mohler & Wurtz 1972 suggested that the output of the SC might be in the intermediate layers. Also, even if it were true that the high motor index neurons are the output, they are very few in the authors' data (this is also true in a lot of other labs, where it is less likely to see purely motor neurons in the SC). So, this makes one wonder if the electrode channels were simply too deep and already out of the SC? In other words, it seems important to show distributions of encountered neurons (regardless of the motor index) across depth, in order to better know how to interpret the tails of the distributions in the motor index histogram and in the other panels of Figure Supplement 1. I elaborate more on these points in the detailed comments below.

      Second, the authors find that the SC cells with a low motor index are modulated by pupil diameter. However, this could be completely independent of an "arousal signal". These cells have substantial visual responses. If the pupil diameter changes, then their activity should be influenced since the monkey is watching a luminous display. So, in this regard, the fact that they do not see "an arousal signal" in most motor neurons (through the pupil diameter analyses) is not evidence that the arousal signal is filtered out from the motor neurons. It could simply be that these neurons are not affected by the pupil diameter because they do not have visual sensitivity. So, even with the pupil data, it is still a bit tricky for me to interpret that arousal signals are excluded from the "output layers" of the SC.

      I think that a remedy to the first point above is to change the text to make it a bit more descriptive and less interpretive. For example, just say that the slow drifts were less evident among the neurons with high motor index.

      For the second point, I think that it is important to consider the alternative caveat of different amounts of light entering the system. Changes in light level caused by pupil diameter variations can be quite large.

    1. eLife assessment

      This study shows that retinal bipolar cell subtype-specific differences in the size of synaptic ribbon-associated vesicle pools contribute to the transient versus sustained kinetics of the responses of retinal ganglion cells. The findings are important and the data are extensive and solid; however, there is also the possibility that glutamate release could be modulated by the kinetics of presynaptic inhibition at bipolar cell terminals, and this may contribute to mediating the transient and/or sustained kinetics of glutamate release. This work will be of broad interest to researchers working on synaptic transmission, retinal signal processing, and sensory neurobiology.

    2. Reviewer #1 (Public Review):

      Summary:

      In the retina, parallel processing of cone photoreceptor output under bright light conditions dissects critical features of our visual environment and is fundamental to visual function. Cone photoreceptor signals are sampled by several types of bipolar cells and passed on to the ganglion cells. At the output of retinal processing, retinal ganglion cells send about 40 different codes of the visual scene to the brain for further processing. In this study, the authors focus on whether subtype-specific differences in the size of synaptic ribbon-associated vesicle pools of bipolar cells contribute to different retinal ganglion cell (RGC) responses. Specifically, inputs to ON alpha RGCs producing transient versus sustained kinetics (ON-T vs. ON-S, respectively) are compared. The authors first demonstrate that ON-S vs. ON-T RGCs are readily identifiable in a whole mount preparation and respond differently both to static stimuli and to a spatially uniform, randomly fluctuating (Gaussian noise) light stimulus. Linear-nonlinear (LN) models were used to estimate the transformation between visual input and excitatory synaptic input for each RGC; these models suggested the presence of transient versus sustained kinetics already in the excitatory inputs to ON-T and ON-S RGCs. Indeed, the authors show that (glutamatergic) excitatory inputs to ON-S vs. ON-T RGCs are of distinct kinetics. The subtypes of bipolar cells providing input to ON-S are known (i.e., type 6 and 7), but the source of excitatory bipolar inputs to ON-T RGCs needed to be determined. In a painstaking process, it is elegantly shown here that ON-T RGCs receive most of their excitatory inputs from type 5 and 6 bipolars. Interestingly, the temporal properties of light-evoked responses of type 5, 6, and 7 bipolars recorded from the somas were indistinguishable and rather sustained, suggesting that the origin of transient kinetics of excitatory inputs to ON-T RGCs suggested by the LN model might be found in the processing of visual signals at the bipolar cell axon terminal. Blocking GABA- or glycinergic inhibitory inputs did not alter the light-evoked excitatory input kinetics to ON-T and ON-S RGCs. Two-photon glutamate sensor imaging revealed significantly faster kinetics of light-evoked glutamate signals at ON-T versus ON-S RGCs. Detailed EM analysis of bipolar cell ribbon synapses onto ON-T and ON-S RGCs revealed fewer ribbon-associated vesicles at ON-T synapses, which is consistent with stronger paired-flash depression of light-evoked excitatory currents in ON-T RGCs versus ON-S RGCs. This study suggests that bipolar subtype-specific differences in the size of synaptic ribbon-associated vesicle pools contribute to transient versus sustained kinetics in RGCs.

      Strengths:

      The use of multiple, state-of-the-art tools and approaches to address the kinetics of bipolar to ganglion cell synapse in an identified circuit.

      Weaknesses:

      For the most part, the data in the paper support the conclusions, and the authors were careful to try to address questions in multiple ways. A two-photon glutamate sensor imaging experiment showing that blocking GABA- and glycinergic inhibition does not change the kinetics of light-evoked glutamate signals at ON-T RGCs would strengthen the conclusion that bipolar subtype-specific differences in the size of synaptic ribbon-associated vesicle pools contribute to transient versus sustained kinetics in RGCs.

    3. Reviewer #2 (Public Review):

      Summary:

      Goal of the study. The authors tried to pinpoint the origins of transient and sustained responses measured at retinal ganglion cells (rgcs), which form the output layer of the retina. Response characteristics of rgcs are used to group them into different types. The diversity of rgc types represents the ability of the retina to transform visual inputs into distinct output channels. They find that the physical dimensions of bipolar cells' synaptic ribbons (specialized release sites/active zones) vary across the different types of cone on-bpcs, in ways that they argue could facilitate transient or sustained release. This diversity of release output is what they argue underlies the differences in on-rgc response characteristics, and ultimately represents a mechanism for creating parallel cone-driven channels.

      Strengths:

      The major strengths of the study are the anatomical approaches employed and the use of the "glutamate sniffer" to assay synaptic glutamate levels. The outline of the study is elegant and reflects the strengths of the authors.

      Weaknesses:

      The major weakness is that the ambitious outline is not matched with a complete set of results, and the set of physiological protocols is disjointed, not sufficient to bridge the systems-level question with the presynaptic release question.

      Major comments on the results and suggestions.

      The ribbon model of release has been explored for decades and needs to be further adapted to systems-level work. The study under consideration by Kuo et al. takes on this task. Unfortunately, the experimental design does not permit a level of control over presynaptic/bpc behavior that is comparable to earlier studies, nor do they manipulate release in ways that test the ribbon model (i.e., paired recordings or Ribeye-ko). Furthermore, the data needs additional evaluation, and the presentation and interpretations should draw on published biophysical and molecular studies.

      To build a ribbon-centric context, consider recent literature that supports the assertion that ribbons play a role in forming AZ release sites and facilitating exocytosis. Reference Ribeye-ko studies. For example, ribbonless bpcs show an 80% reduction in release (Maxeiner et al EMBO J 2016), the ribbonless retina exhibits signaling deficits at the output layer (Okawa et al ...Rieke, ..Wong Nat Comm 2019), and ribbonless rods show an 80% reduction in the readily releasable pool (RRP) of SVs (Grabner Moser, eLife 2021). In addition, the authors could refer to whole-cell membrane capacitance studies on mammalian rods, cones, and bpcs, because the size of the RRP of SVs scales with the dimensions and numbers of ribbons (total ribbon footprint). For bipolars, see the review by Wan and Heidelberger 2011. For a comparison of mammalian rods and cones, see, rods: Grabner and Moser (2021 eLife), Mueller.. Regus Leidig et al. (2019; J Neurosci) and cones Grabner ...DeVries (Nat Comm 2023). A comparison of cell types shows that (1) the extent of release is proportional to the total size of the ribbon footprint, and (2) less release is witnessed when ribbons are deleted (also see photo ablation studies by Snellman.... And Mehta..Zenisek, Nat Neurosci and Neuron).

      Ribbon morphology may change in an activity-dependent manner. The rod ribbon AZ has been reported to lengthen in the dark (Dembla et al 2020), and deletion of the ribbon shortens the length of the AZ (defined by Cav1.4 or RIM2); in addition, the Ribeye-ko AZs fail to change in size with light and dark conditioning. Furthermore, EM studies on rod and cone AZs in light and dark argue that the number of SVs at the base of the ribbon increases in the dark, when PRs are depolarized (see Figure 10, Babai et al 2016 JNeurosci). Lastly, using goldfish Mb1 on-bipolars, Hull et al (2006, J Neurophysiol) correlated an increase in release efficiency with an increase in ribbon numbers, which accompanied daylight. >> When release activity is high, ribbon AZ length increases (Dembla, rods), the number of docked SVs increases (Babai, rods cones), and the number of ribbons increases (Hull, diurnal Mb1s).

      The results under review, Kuo et al., were attained with SBF-SEM, which has the benefit of addressing large-volume questions as required here, yet it achieves lower spatial resolution than what is attained with TEM tomography and FIB-EM. Ideally, the EM description would include SV size, and the density of ribbon-tethered SVs that are docked at the plasma membrane, because this is where the SVs fuse (additional non-ribbon release sites may also exist? Mehta ... Singer 2014 J Neurosci). Studies by Graydon et al 2011 and 2014 (both in J Neurosci), and Jean ... Moser et al 2018 (eLife) are good examples of quantitative estimates of SV docking sites at ribbons. SBF-SEM does not allow for an assessment of SVs within 5 nm of the PM, but if the authors can identify the number of SVs that appear within the limit of resolution (10 to 15 nm) from the PM, then this data would be useful. Also, what dimension(s) of the large ribbons make them larger? Typically, ribbons are fixed in height (at least in the outer retina, 200 to 250 nm), but their length varies and the number of ribbons per terminal varies. Is the larger ribbon size observed in type 6 bpcs due to longer ribbons, or taller ribbons? A longer ribbon likely has more docked SVs. An additional possibility is that more SVs are about the ribbon-PM footprint, either more densely packed and/or expanding laterally (see definitions in Jean....Moser, eLife 2018).

      The ribbon literature given above makes the argument that ribbons increase exocytotic output, and morphological studies suggest that release activity enhances 1) ribbon length (Dembla) and 2) the density of SVs near the PM (Babai). These findings could lead one to propose that type 6 bpcs (inputs to On-sustained) are more active than type 5i (feed into On-transient). Here Kuo et al. show that the bpcs have similar Vm (measured from the soma) in response to light stimulation. Does Vm predict release? Not entirely, as the authors acknowledge, because Cav channel properties, SV availability, and negative feedback are all downstream of bpc Vm. The only experiment performed to test downstream factors focused on negative feedback from amacrines. The data presented in Figures 5C-F led me to conclude the opposite of what the authors concluded. My impression is that the T-On rgc exhibits strong disinhibition when GABA-blockers are applied (the initial phase is greatly increased in amplitude and broadened with the drug), which contrasts with the S-On rgc responses that show a change in the amplitude of the initial phase but not its width (taus would be nice). Here and in many places the authors refer to changes in release kinetics, without implementing a useful description of kinetics. For instance, take the cumulative current (charge) in Figure 5C and fit the control and drug traces to arrive at taus, and their respective amplitudes, and use these values to describe kinetic phases. One final point: the summary in Figure 5D has p = 0.06, very close to the cutoff for significance, which calls for more than n = 5. Given that previous studies have shown that bpc output is shaped by immediate msec GABA feedback, in ways that influence kinetic phases of release (Mb1 bipolars, see Vigh et al 2005 Neuron), more attention to this matter is needed before the authors rule out feedback inhibition in favor of ribbon size. If, by chance, type 5i bpcs are under uniquely strong feedback inhibition, then ribbon size may result from less activity, rather than less output resulting from smaller ribbons.
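
      The kind of kinetic description suggested above (integrate the current to charge, then fit a time constant and an amplitude) could look roughly like the following sketch. It assumes a single exponential phase and a trace long enough to plateau; the function names, the synthetic trace, and the 0.95 plateau cutoff are illustrative assumptions and are not taken from the manuscript. Real traces with several kinetic phases would need a multi-exponential fit.

```python
import math

def cumulative_charge(current, dt):
    """Running trapezoidal integral of a current trace (charge vs. time)."""
    charge = [0.0]
    for i0, i1 in zip(current, current[1:]):
        charge.append(charge[-1] + 0.5 * (i0 + i1) * dt)
    return charge

def fit_single_exponential(charge, dt, plateau_fraction=0.95):
    """Fit Q(t) = Q_inf * (1 - exp(-t / tau)) by log-linear regression.

    Assumption: Q_inf is approximated by the last sample, so the trace must
    reach a plateau. Only points below plateau_fraction * Q_inf are used.
    Returns (tau, Q_inf).
    """
    q_inf = charge[-1]
    ts, ys = [], []
    for k, q in enumerate(charge):
        if q < plateau_fraction * q_inf:
            ts.append(k * dt)
            ys.append(math.log(q_inf - q))
    n = len(ts)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    den = sum((t - t_mean) ** 2 for t in ts)
    slope = num / den          # slope of log(Q_inf - Q) vs. t equals -1/tau
    return -1.0 / slope, q_inf

# Synthetic example (hypothetical values): 100 pA decaying with tau = 50 ms.
dt = 0.001                                     # seconds per sample
true_tau = 0.05                                # seconds
current = [100.0 * math.exp(-k * dt / true_tau) for k in range(1001)]
tau, q_inf = fit_single_exponential(cumulative_charge(current, dt), dt)
print(f"tau = {tau:.4f} s, amplitude = {q_inf:.3f} pC")
```

      Applying the same fit to control and drug traces would give one tau and one amplitude per condition, which can then be compared directly.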

      As mentioned above, the behavior of Cav channels is important here. This is difficult to address with voltage clamps from the soma, especially in the Vm range relevant to this study. Given that it has previously been modeled that the rod bpc to AII pathway adapts to prolonged depolarization of rbcs through downregulating Cav channel-mediated Ca2+ influx (Grimes ... Rieke 2014 Neuron), it seems important for Kuo et al. to test if there is a difference in Cav regulation between type 6 and 5i bpcs. Ca2+ imaging with a GCaMP strategy (Baden ... Lagnado Current Biology, 2011) or filling the presynapse with Ca dyes (see inner hair cells: Ozcete and Moser, EMBO J 2020) would allow for the correlation of [Ca]intra with iGluSnFR signals (both local readouts).

      Stimulation protocol and presentation of Glutamate Sniffer data in Figure 6. In all of your figures where you state steady state as a % of peak amplitude, please indicate in the figure where you estimate steady state. Alternatively, if you take the cumulative dF/F signal, then you can fit the different kinetic phases. From the appearance of the data, the Sustained Glu signals look like square waves (Figure 6B ROI1-4), without a transient at onset, which is not predicted in your ribbon model that assumes different kinetic phases (1. depletion of docked SVs, and 2. refilling and repriming). The Transient responses (Figure 6B ROI5-8) are transient and more compatible with a depressing ribbon scheme. If you take the cumulative for all of the On-S responses and compare it to all of the On-T responses, my guess is the cumulative dF/F will be 10- to 20-fold larger for the S-On. Would you conclude that bpc inputs to On-S (type 6) release 20-fold more SVs per 4 seconds on a per ribbon basis, and does the surface area of the type 6 bpcs account for this difference? From Figures 8B and D, the volume of the ribbon is ~2-fold greater for type 6 vs 5i, but the Surface Area (both faces of ribbon) is more relevant to your model that claims ribbon size is the pivotal factor. If making cumulative traces and comparisons on an absolute scale is unfounded, then we need to know how to compare different observations. The classic ribbon models always have a conversion factor such as the capacitance of an SV or q size that is used to derive SV numbers from total dCm or Qcontent. See Kim ... von Gersdorff 2023, Cell Reports. Why not use the Gaussian noise stimulus in Figure 6 as in Figures 1 and 2?
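
      The two summary measures discussed above (steady state as a percentage of peak, with an explicitly stated measurement window, and the cumulative dF/F) can be sketched as below. The traces are synthetic stand-ins, and the window, time constant, and function names are assumptions chosen for illustration only, not values from the manuscript.

```python
import math

def steady_state_percent(trace, dt, window):
    """Mean of the trace inside window=(t_start, t_end), as % of the peak.

    Stating the window explicitly makes clear where steady state is measured.
    """
    t_start, t_end = window
    samples = [v for k, v in enumerate(trace) if t_start <= k * dt < t_end]
    return 100.0 * (sum(samples) / len(samples)) / max(trace)

def cumulative_response(trace, dt):
    """Rectangle-rule integral of dF/F over the whole trace."""
    return sum(trace) * dt

# Synthetic 4 s responses sampled at 100 Hz (hypothetical shapes).
dt = 0.01
sustained = [1.0] * 400                                    # square-wave-like
transient = [math.exp(-k * dt / 0.2) for k in range(400)]  # decays after onset

window = (3.0, 4.0)   # assumed steady-state window: last second of the step
for name, trace in (("sustained", sustained), ("transient", transient)):
    print(name,
          round(steady_state_percent(trace, dt, window), 3),
          round(cumulative_response(trace, dt), 3))
```

      Comparing the cumulative values across response types only makes sense if the dF/F scale is comparable between ROIs, which is the caveat raised in the paragraph above.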

      Figure 7. What is the recovery time for mammalian cones derived from ribbon-based models? There are estimates from membrane capacitance studies. Ground squirrel cones take 0.7 to 1 sec to recover the ultrafast, primed pool of SVs when probed with a paired-pulse protocol (Grabner ...DeVries 2016, Neuron). Their off-bpcs take anywhere from under 0.2 sec to a second to recover, which is a combination of many synaptic factors (Grabner ...DeVries Nat Comm 2023). Rod On bpcs take over a second (Singer Diamond 2006, reviewed Wan and Heidelberger 2011). In Figure 7B, the recovery time is ~150 ms for the responses measured at rgcs. This brief recovery time is incompatible with existing ribbon models of release. Whole-cell membrane capacitance measurements would be helpful here.

      Experimental Suggestion: Add GABA blockers and see if type 5i bpc responds with more release (GluSniff) and prolonged [Ca2+] intra (GCaMP). Compare this to type 6 bpc behavior with GABA/gly blockers. This will rule in or out whether feedback inhibition is involved.

    4. Reviewer #3 (Public Review):

      Summary:

      Different types of retinal ganglion cell (RGC) have different temporal properties - most prominently a distinction between sustained vs. transient responses to contrast. This has been well established in multiple species, including mice. In general, RGCs with dendrites that stratify close to the ganglion cell layer (GCL) are sustained; whereas those that stratify near the middle of the inner plexiform layer (IPL) are transient. This difference in RGC spiking responses aligns with similar differences in excitatory synaptic currents as well as with differences in glutamate release in the respective layers - shown previously and here, with a glutamate sensor (iGluSnFR) expressed in the RGCs of interest. Differences in glutamate release were not explained by differences in the distinct presynaptic bipolar cells' voltage responses, which were quite similar to one another. Rather, the difference in transient vs. sustained responses seems to emerge at the bipolar cell axon terminals in the form of glutamate release. This difference in the temporal pattern of glutamate release was correlated with differences in the size of synaptic ribbons (larger in the bipolar cells with more sustained responses), which also correlated with a greater number of vesicles in the vicinity of the larger ribbons.

      The main conclusion of the study relates to a correlation (because it is difficult to manipulate ribbon size or vesicle density experimentally): the bipolar cells with increased ribbon size/vesicle number would have a greater possibility of sustained release, which would be reflected in the postsynaptic RGC synaptic currents and RGC firing rates. This model proposes a mechanism for temporal channels that is independent of synaptic inhibition. Indeed, some experiments in the paper suggest that inhibition cannot explain the transient nature of glutamate release onto one of the RGC types. Still, it is surprising that such a diverse set of inhibitory interneurons in the retina would not play some role in diversifying the temporal properties of RGC responses.

      Strengths:

      (1) The study uses a systematic approach to evaluating temporal properties of retinal ganglion cell (RGC) spiking outputs, excitatory synaptic inputs, presynaptic voltage responses, and presynaptic glutamate release. The combination of these experiments demonstrates an important step in the conversion from voltage to glutamate release in shaping response dynamics in RGCs.

      (2) The study uses a combination of electrophysiology, two-photon imaging, and scanning block-face EM to build a quantitative and coherent story about specific retinal circuits and their functional properties.

      Weaknesses:

      (1) There were some interesting aspects of the study that were not completely resolved, and resolving some of these issues may go beyond the current study. For example, it was interesting that different extracellular media (Ames medium vs. ACSF) generated different degrees of transient vs. sustained responses in RGCs, but it was unclear how these media might have impacted ion channels at different levels of the circuit that could explain the effects on temporal tuning.

      (2) It was surprising that inhibition played such a small role in generating temporal tuning. At the same time, there were some gaps in the investigation of inhibition (e.g., IPSCs were not measured in either of the RGC types; pharmacology was used to investigate responses only in the transient RGCs).

      (3) There could be additional discussion and references to the literature describing several topics, including: temporal dynamics of glutamate release at different levels of the IPL; previous evidence that release sites from a single presynaptic neuron can differ in their temporal properties depending on the postsynaptic target; previous investigations of the role of inhibition in temporal tuning within retinal circuitry.

    1. eLife assessment

      This important study examined the dynamics of attentional reorientation in working memory by assessing alpha-band lateralization in EEG recordings and saccade bias and provides convincing evidence for a second stage of internal attentional deployment during WM. This work provides novel insights into the dynamic mechanism in WM and will be of broad interest and impact to cognitive neuroscience, including attention and working memory. Performing additional analysis to disentangle the roles of saccade and micro-saccade and to show behavioral relevance would further strengthen the conclusion.

    2. Reviewer #1 (Public Review):

      In the study "Re-focusing visual working memory during expected and unexpected memory tests" by Sisi Wang and Freek van Ede, the authors investigate the dynamics of attentional re-orienting within visual working memory (VWM). Utilizing a robust combination of behavioral measures, electroencephalography (EEG), and eye tracking, the research presents a compelling exploration of how attention is redirected within VWM under varying conditions. The research question addresses a significant gap in our understanding of cognitive processes, particularly how expected and unexpected memory tests influence the focus and re-focus of attention. The experimental design is meticulously crafted, enabling a thorough investigation of these dynamics. The figures presented are clear and effectively illustrate the findings, while the writing is concise and accessible, making the complex concepts understandable. Overall, this study provides valuable insights into the mechanisms of visual working memory and attentional re-orienting, contributing meaningfully to the field of cognitive neuroscience. Despite the strengths of the manuscript, there are several areas where improvements could be made.

      Microsaccades or Saccades?

      In the manuscript, the terms "microsaccades" and "saccades" are used interchangeably. For instance, "microsaccades" are mentioned in the keywords, whereas "saccades" appear in the results section. It is crucial to differentiate between these two concepts. Saccades are large, often deliberate eye movements used for scanning and shifting attention, while microsaccades are small, involuntary movements that maintain visual perception during fixation. The authors note the connection between microsaccades and attention, but it is not well-recognized that saccades are directly linked to attention. Although the paradigm involves a fixation point, the authors do not clarify whether large eye movements (saccades) were excluded from the analysis. If large eye movements were removed during data processing, this should be documented in the manuscript, including clear definitions of "microsaccades" and "saccades." If such trials were not removed, the contribution of large eye movements to the results should be shown, and an explanation provided as to why they should be considered.

      Alpha Lateralization in Attentional Re-orienting

      In the attentional orienting section of the results (Figure 2), the authors effectively present EEG alpha lateralization results with time-frequency plots and topographic maps. However, in the attentional re-orienting section (Figure 3), these visualizations are absent. It is important to note that the time period in attentional orienting differs from attentional re-orienting, and consequently, the time-frequency plots and topographic maps may also differ. Therefore, it may be invalid to compute alpha lateralization without a clear alpha activity difference. The authors should consider including time-frequency plots and topographic maps for the attentional re-orienting period to validate their findings.

      Onset and Offset Latency of Saccade Bias

      The use of the 50% peak to determine the onset and offset latency of the saccade bias is problematic. For example, if one condition has a higher peak amplitude than another, the standard for saccade bias onset would be higher, making the observed differences between the onset/offset latencies potentially driven by amplitude rather than the latencies themselves. The authors should consider a more robust method for determining saccade bias onset and offset that accounts for these amplitude differences.
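
      For reference, the 50%-of-peak criterion under discussion can be sketched as follows. This is an illustration of the general fractional-peak idea on a made-up trace, assuming a single positive peak and linear interpolation between samples; it is not the authors' analysis code, and the function name is hypothetical.

```python
def fractional_peak_latencies(times, trace, fraction=0.5):
    """Return (onset, offset) times at which the trace crosses
    fraction * peak, on either side of its maximum.

    Assumes one dominant positive peak. Crossings are linearly interpolated
    between neighbouring samples. If no crossing is found, the first or last
    time point is returned.
    """
    peak_idx = max(range(len(trace)), key=lambda k: trace[k])
    threshold = fraction * trace[peak_idx]

    def crossing(k_lo, k_hi):
        # Linear interpolation of the threshold crossing between two samples.
        y0, y1 = trace[k_lo], trace[k_hi]
        frac = (threshold - y0) / (y1 - y0)
        return times[k_lo] + frac * (times[k_hi] - times[k_lo])

    onset = times[0]
    for k in range(peak_idx, 0, -1):           # walk back from the peak
        if trace[k - 1] < threshold <= trace[k]:
            onset = crossing(k - 1, k)
            break

    offset = times[-1]
    for k in range(peak_idx, len(trace) - 1):  # walk forward from the peak
        if trace[k] >= threshold > trace[k + 1]:
            offset = crossing(k, k + 1)
            break

    return onset, offset

# Made-up saccade-bias time course (arbitrary units, arbitrary time steps).
times = list(range(11))
bias = [0, 0, 2, 4, 6, 8, 10, 5, 0, 0, 0]
onset, offset = fractional_peak_latencies(times, bias)
print(onset, offset)
```

      Because the threshold is defined relative to each condition's own peak, the latency estimate is tied to how the peak itself is estimated, which is why the choice of criterion deserves an explicit justification.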

      Control Analysis for Trials Not Using the Initial Cue

      The control analysis for trials where participants did not use the initial cue raises several questions:

      (1) The authors claim that "unlike continuous alpha activity, saccades are events that can be classified on a single-trial level." However, alpha activity can also be analyzed at the single-trial level, as demonstrated by studies like "Alpha Oscillations in the Human Brain Implement Distractor Suppression Independent of Target Selection" by Wöstmann et al. (2019). If single-trial alpha activity can be used, it should be included in additional control analyses.

      (2) The authors aimed to test whether the re-orienting signal observed after the test is not driven exclusively by trials where participants did not use the initial cue. They hypothesized that "in such a scenario, we should only observe attention deployment after the test stimulus in trials in which participants did not use the preceding retro cue." However, if the saccade bias is the index for attentional deployment, the authors should conduct a statistical test for significant saccade bias rather than only comparing toward-saccade after-cue trials with no-toward-saccade after-cue trials. The null results between the two conditions do not immediately suggest that there is attention deployment in both conditions.

      (3) Even if attention deployment occurs in both conditions, the prolonged re-orienting effect could also be caused by trials where participants did not use the initial cue. Unexpected trials usually involve larger and longer brain activity. The authors should perform the same analysis on the time after the removal of trials without toward-saccade after the cue to address this potential confound.

    3. Reviewer #2 (Public Review):

      Summary:

      This study utilized EEG-alpha activity and saccade bias to quantify the spatial allocation of attention during a working memory task. The findings indicate a second stage of internal attentional deployment following the appearance of a memory test, revealing distinct patterns between expected and unexpected test trials. The spatial bias observed during the expected test suggests a memory verification process, whereas the prolonged spatial bias during the unexpected test suggests a re-orienting response to the memory test. This work offers novel insights into the dynamics of attentional deployment, particularly in terms of orienting and re-orienting following both the cue and memory test.

      Strengths:

      The inclusion of both EEG-alpha activity and saccade bias yields consistent results in quantifying the attentional orienting and re-orienting processes. The data clearly delineate the dynamics of spatial attentional shifts in working memory. The findings of a second stage of attentional re-orienting may enhance our understanding of how memorized information is retrieved.

      Weaknesses:

      Although analyses of neural signatures and saccade bias provided clear evidence regarding the dynamics of spatial attention, the link between these signatures and behavioral performance remains unclear. Given the novelty of this study in proposing a second stage of 'verification' of memory contents, it would be more informative to present evidence demonstrating how this verification process enhances memory performance.

    4. Reviewer #3 (Public Review):

      Summary:

      Wang and van Ede investigate whether and how attention re-orients within visual working memory following expected and unexpected centrally presented memory tests. Using a combination of spatial modulations in neural activity (EEG-alpha lateralization) and gaze bias quantified as time courses of microsaccade rate, the authors examined how retro cues with varying levels of reliability influence attentional deployment and subsequent memory performance. The conclusion is that attentional re-orienting occurs within visual working memory, even when tested centrally, with distinct patterns following expected and unexpected tests. The findings provide new value for the field and are likely of broad interest and impact, by highlighting working memory as an action-bound process (in)dependent on (an ambiguous) past.

      Strengths:

      The study uniquely integrates behavioral data (accuracy and reaction time), EEG-alpha activity, and gaze tracking to provide a comprehensive analysis of attentional re-orienting within visual working memory. As typical for this research group, the validity of the findings follows from the task design that effectively manipulates the reliability of retro cues and isolates attentional processes related to memory tests. The use of a well-established marker for spatial attention (i.e. alpha lateralization) and a more recently entangled dependent variable (gaze bias) is commendable. Utilizing these dependent metrics, the concise report presents a thorough analysis of the scaling effects of cue reliability on attentional deployment, both at the behavioral and neural levels. The clear demonstration of prolonged attentional deployment following unexpected memory tests is particularly noteworthy. Although there are, by definition, no significant time clusters (time is not a factor in a statistical sense), the jackknife approach is convincing. Overall, the evidence is compelling, allowing the conclusion of a second stage of internal attentional deployment following both expected and unexpected memory tests, highlighting the importance of memory verification and re-orienting processes.

      Weaknesses:

      I want to stress upfront that these weaknesses are not specific to the presented work and do not affect my recommendation of the paper in its present form.

      The sample size is consistent with previous studies, but a larger sample could enhance the generalizability and robustness of the findings. The authors acknowledge high noise levels in EEG-alpha activity, which may affect the reliability of this marker. This is a general issue in non-invasive electrophysiology that cannot be handled by the authors, but an interested reader should be aware of it. Effectively, the sensitivity of the gaze analysis appears "better" in part due to the better SNR. The latter also sets the boundaries for single-trial analyses, as the authors correctly mention. In terms of generalizability, I am convinced that the main outcome will likely generalize to different samples and stimulus types. Yet, as is typical for the field, future research could explore different contexts and task demands to validate and extend the findings. The authors provide the means to do so (including sharing of data and code).

    1. eLife assessment

      This valuable study investigates the contribution of far-red light photo-acclimated cyanobacteria to primary production in intertidal beachrock habitats. Though the study presents solid evidence, the text would benefit from an improved discussion section and some additional methodological details.

    2. Reviewer #1 (Public Review):

      Summary:

      Mosshammer et al. studied the oxygenic photosynthetic productivity of beachrock samples containing cyanobacteria with different pigment compositions. The use of longer wavelength absorbing chlorophylls in some cyanobacteria (chlorophylls d and f) allows their photosystems to use light further in the red than canonical chlorophyll a photosystems. As such, their distribution in visible light-shaded environments, such as the beachrock studied by Mosshammer et al., allows them to perform oxygenic photosynthesis using wavelengths not capable of driving photosynthesis in most cyanobacteria, algae, or plants.

      By adapting measuring systems they have previously used to study these types of beachrock samples, the authors attempt to mimic a more natural light penetration through the beachrock in order to measure oxygen production. By doing so with different wavelengths and intensities, the authors are able to show that far-red light-driven oxygen production is potentially capable of driving high levels of gross primary production.

      Strengths:

      The manuscript builds on previous measurement techniques used by the authors while focussing on illumination from the top of a sample rather than the specific microbial layers themselves. This provides a more environmentally realistic understanding of the beachrock community, as well as far-red light-driven photosynthesis.

      The manuscript benefits from using previously defined methods to further characterize complex environmental samples.

      Weaknesses:

      The manuscript suffers from a lack of discussion and interpretation of the findings, and as such is more of a report.

      Using the environmental beachrock samples has inherent complications, from the variation in rock morphology to the microbial community composition of different samples as well as within a single sample. It would benefit the authors to discuss these technical difficulties in more detail, as the light penetration through the beachrock is likely greatly limiting measurements of chlorophyll f and/or chlorophyll d-driven photosynthesis in the beachrock.

      This can be seen in the luminescence measurements (Figure 2 and supplements), where the different samples show clear differences in far-red light-driven oxygen production. While the BLACK sample produces oxygen with the 740 nm LED filtered with a NIR-75N filter, neither of the other two samples produces measurable oxygen under this condition. Conversely, this sample results in the lowest level of gross photosynthesis when measuring dissolved oxygen. A more detailed discussion of the variation between and within samples and measurements would benefit the overall results of the manuscript.

      The PINK beachrock sample has the highest level of chlorophyll d per chlorophyll a. As FaRLiP cyanobacteria only incorporate 1 chlorophyll d per photosystem II, and none in photosystem I, is there a (relatively) high composition of Acaryochloris species in the PINK sample? If normalized to the reflectance minima, can more distinct populations be identified?

      For Figure 1, multiple points should be clarified. The first is that the HPLC methods are estimates of concentrations, as the extinction coefficients are not correct for the solvent solution for which the pigments elute, and are likely to be differently incorrect for each pigment. This results in quantitatively incorrect data, but qualitative comparisons between samples likely remain valid. Secondly, the pigment concentrations can also be misleading. Within the cyanobacterial cells, photosystem I harbors approximately 3 times as many chlorophylls as photosystem II. While the community numbers and photosystem stoichiometry are not necessarily relevant to the current study, the red shift in absorbance between photosystem II and photosystem I is of importance for the measurements performed. How cyanobacterial cells with differing concentrations of photosystems will absorb the red tail of the far-red LEDs, as well as impact the light penetration would be a useful discussion point.

      The different samples used are from varying beachrock zonations but have the same chlorophyll f per chlorophyll a concentrations. A discussion of why this might be would be useful.

      For the luminescence measurements (Figure 2 and supplements), no oxygen production is seen in the BROWN or PINK beachrock samples when the 740 nm LED is filtered with a NIR-75N filter. This is likely due to multiple factors (low initial intensity compounded by penetration depth, community composition, etc.) but should be discussed. While the authors say that Chroococcidiopsis species dominate the samples, variation of absorbance between different chlorophyll f-containing cyanobacteria has also been measured (see Tros et al. 2021, Chem), and the extent to which even chlorophyll f species extend into the far-red varies. Discussions about these implications would help with their characterization of the luminescence data. While the authors discuss that, based on their respiration measurements, the oxygen may be being consumed, resulting in an inability to measure it (lines 147-150), other explanations are clearly viable.

      For the luminescence measurements, no oxygen production is discernable in the endolithic region when excited with visible light, which is at a much stronger intensity than the near-infrared light used. However, both Acaryochloris and chlorophyll f cyanobacteria are capable of driving photosynthesis with visible light. As the intensities used are much brighter than for the NIR measurements, presumably generated oxygen would be higher than what could be immediately consumed by respiration. It is important that the authors address this.

      A highlighted point by the authors is the >20% of photosynthesis driven by NIR in the beachrock at comparable irradiation. However, this statement is deceiving for multiple reasons.

      (1) The irradiation is likely not comparable for what is reaching the cells. This is not a problem per se as illumination from above is the point, but does skew the interpretation.

      (2) The >20% value comes from the maximum amount of gross photosynthesis driven by NIR at ~1400 µmol photons m-2 s-1, whereas at other comparable illuminations the value is much, much lower (<1%). A likely interpretation of such data is that while the chlorophyll f endolithic layer is capable of producing a relatively large amount of oxygen, it is likely far less productive under most illuminations, though not zero.

      The authors have the difficult task of weaving in results from laboratory, uniculture or isolated photosystem measurements with their environmental-based results. This is especially clear in lines 172-183. While the authors are correct that measurements of trapping times in chlorophyll f containing photosystems have been measured and are slower in chlorophyll f photosystem II and photosystem I relative to all chlorophyll a photosystems, the quantum yield for trapping remains high in chlorophyll f photosystem I (Tros et al. 2021, Chem). The quantum yield of trapping for chlorophyll f photosystem II is much lower for chlorophyll f than chlorophyll a complex, though improved by the attachment of phycobilisomes. However, these are intrinsic physical properties of the complexes that are not modulated in response to the environments. This could be interpreted that at low photon flux densities as measured in these experiments, the endolithic near infrared-driven oxygen production could be limited by an overall lower quantum efficiency of trapping the captured light and thus minimizing photosynthetic productivity relative to a theoretical level based on the efficiency of the chlorophyll a photosystem II. How the variations in intensity and spectral composition impact the cyanobacterial community likely involves many other factors and has not been addressed (though see Nurnberg et al. 2018, Science and Viola et al. 2022 eLife for further discussions).

    3. Reviewer #2 (Public Review):

      The authors investigate the role of near-infrared photosynthesis in primary production across three beachrock communities. This work is particularly pertinent as more cyanobacteria with far-red light acclimation capacities are discovered, underscoring the need to assess their contributions to primary production. However, the manuscript is currently very difficult to follow due to unclear correlations between the text and figures and the samples analyzed in the different experiments. Additional explanations would also enhance clarity. For example, it would be beneficial for the authors to better define the three communities, as distinctions are not apparent. Another example is the pigment analysis, where the extinction coefficients for pigments vary in different solvents. Quantification by chromatography should use calibration curves for all pigments, not just Chl a, as is currently done. Pigments can be easily purified from cyanobacteria for this purpose.

    4. Reviewer #3 (Public Review):

      Summary:

      On islands in the Pacific, beachrock occurs near high tide level, composed of calcareous material. The surface of the beach rock is colonised by cyanobacteria and some eukaryotic algae. On Heron Island on the Southern Great Barrier Reef, beach rock occurs on the north and south side of the island in continuous slabs, which slope gently upwards toward the island. Thus the upper beach rock is only inundated at extreme high tides. On the south side, the major photosynthetic organism is a cyanobacterium, Chroococcidiopsis, which forms tough smooth mats over all the beach rock. This cyanobacterium belongs to a newly discovered class called FaRLiP photosynthesisers, which carry out conventional photosynthesis under visible radiation using chlorophyll a (Chl a) but which deactivate most of the Chl a under near-infrared radiation (NIR) and produce chlorophyll f and chlorophyll d, which can absorb NIR (700 - 760 nm). These NIR Chl molecules are repositioned in the reaction centres. In addition, an NIR-activated allophycocyanin (a phycobiliprotein) is synthesised and placed in the reaction centres. These FaRLiP cyanobacteria can carry out photosynthesis and primary production when placed under NIR. Here it is shown that in the mats of Chroococcidiopsis on the beach rock the upper layers carry out conventional photosynthesis while the lower layers carry out FaRLiP photosynthesis. It is shown that the FaRLiP-activated lower layers can produce up to 20% of the total photosynthetic primary production.

      Strengths:

      The authors have researched sections of beachrock obtained from the beach rock on Heron Island. The Beach Rock on Heron Island occurs on both sides of the Island lying in a semi-horizontal position slightly sloping upwards toward the Island. At normal high tide, only the upper parts are not submerged. Black crusts occur in the uppermost parts of the beachrock. Brown crusts occur in the intermediate sites and pink crusts occur at the lowest part of the beachrock.

      The crusts are made up largely of cyanobacteria and the major component is a cyanobacterium of one species, tentatively identified by shape, pigmentation, and partial DNA analysis as Chroococcidiopsis.

      In this investigation, sections of the beachrock from different levels have been analysed using the following techniques:

      (1) Hyperspectral analysis to determine the layout of pigmented cells and their spectra.

      (2) Bioluminescence to determine the spectra of the cells in the sections.

      (3) Oxygen analysis, using luminescence lifetime imaging on special films closely applied to vertical sections of the beachrock.

      (4) Oxygen production from the surface of three-dimensional blocks of beachrock, illuminated from above with white light or near-infrared (NIR) radiation.

      In addition, pigmentation has been analysed by High Performance Liquid Chromatography (HPLC).

      These techniques allow the following conclusions:

      (1) Scytonemin is a main screening compound for UV irradiation.

      (2) Carotenoids also play a part in screening from UV and probably visible radiation.

      (3) The cyanobacteria occur near the rock surface and contain Chl a plus some Chl f and a small amount of Chl d.

      (4) HPLC pigment analysis confirms the presence of Chl a plus Chl f and a small amount of Chl d.

      (5) The deeper layer with FaRLiP cyanobacteria produces oxygen under both visible light and NIR irradiation, with different P vs I curves.

      (6) Measurements of oxygen exchange above the beachrock surface in the oxygen chamber showed that respiration was high: it consumed most of the oxygen produced, so that at lower light levels only the brown samples released significant oxygen to the water column, and the other samples did so only at the highest visible light intensities. Under NIR, oxygen production was much lower and exceeded the compensation point significantly only in the brown samples.

      (7) FaRLiP primary production was significant in the deeper layer.

      The major new conclusion from these studies is that FaRLiP photosynthesis is a significant factor in this biofilm, and possibly other biofilms. Visible light is mostly absorbed in the upper layers and NIR reaching the lower layers induces FaRLiP photosynthesis and primary production, which can be up to 20% of the total primary production of the film.

      Weaknesses:

      The techniques are sufficient to justify the conclusions, especially the new result that the FaRLiP photosynthesis deeper in the films is surprisingly active with relatively high primary productivity. This is an important conclusion but it must be realised that there is some way to go to polish up the results and gain more quantitative results.

      Firstly, the beachrock is a heterogeneous material, so cutting a section leaves a non-homogeneous surface where various sand grains are removed, cut, or not removed. This means that when a luminescence film is applied, the results depend on the uniformity of the surface, or rather the lack of it. This needs to be taken into consideration in future studies.

      Furthermore, previous papers have revealed that pits in the beach rock are important sites for FaRLiP cyanobacteria and the paper needs to make clear that these pits were avoided here.

      Secondly, while Chroococcidiopsis is the major alga/cyanobacterium present, other algae/cyanobacteria are present and their presence needs to be factored into the results. In this regard we need more microscopic images of the surface and cross-sections of the beachrock, to reveal the nature of the bacterial and algal organisms.

      Thirdly, it is not clear from this paper how far the identification of Chroococcidiopsis is firm. Presumably preliminary DNA analyses have been carried out on tell-tale genes (rRNA?). At some stage, a complete genome will be needed. Mention should be made about what has been done and what is contemplated.

      Fourthly, the acclimation to FaRLiP is time-dependent. How long does it take in these beach rock sections? And has sufficient notice been taken of this time-dependent process?

      Fifthly, FaRLiP is a sophisticated system as shown by Mascoli et al, 2022. It is activated in NIR by red-shifted allophycocyanin. It is also dependent on the allocation of Chl f and Chl d to special positions in the reaction centre. All this may take some time and be light-dependent. This may explain the curious increase in the slopes of light vs productivity of Fig 4 (Pink and Black) for NIR light.

      The fifth point needs to be taken into account in any rewrite of the paper. The authors assume that the upwardly sloping P vs I curve is explained as follows: "This can be explained by the light attenuation due to scattering and absorption in the compacted beachrock biofilm, which prevented saturation of NIR-driven photosynthesis in the endolithic layer even at levels of incident light similar to solar irradiation on mid-day exposed beachrock."

      Activation of the FaRLiP system also needs to be considered.

    1. Author response:

      The following is the authors’ response to the current reviews.

      We thank the Reviewers and Editors for the constructive comments, which we believe have significantly improved the quality of our manuscript.


      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      (1) With respect to the predictions, the authors propose that the subjects, depending on their linguistic background and the length of the tone in a trial, can put forward one or two predictions. The first is a short-term prediction based on the statistics of the previous stimuli and identical for both groups (i.e. short tones are expected after long tones and vice versa). The second is a long-term prediction based on their linguistic background. According to the authors, after a short tone, Basque speakers will predict the beginning of a new phrasal chunk, and Spanish speakers will predict it after a long tone.

      In this way, when a short tone is omitted, Basque speakers would experience the violation of only one prediction (i.e. the short-term prediction), but Spanish speakers will experience the violation of two predictions (i.e. the short-term and long-term predictions), resulting in a higher amplitude MMN. The opposite would occur when a long tone is omitted. So, to recap, the authors propose that subjects will predict the alternation of tone durations (short-term predictions) and the beginning of new phrasal chunks (long-term predictions).

      The problem with this is that subjects are also likely to predict the completion of the current phrasal chunk. In speech, phrases are seldom left incomplete. In Spanish it is very unlikely to hear a function-word that is not followed by a content-word (and the opposite happens in Basque). On the contrary, after the completion of a phrasal chunk, a speaker might stop talking and a silence might follow, instead of the beginning of a new phrasal chunk.

      Considering that the completion of a phrasal chunk is more likely than the beginning of a new one, the prior endowed to the participants by their linguistic background should make us expect a pattern of results actually opposite to the one reported here.

      We thank Reviewer #1 for this pertinent comment and the opportunity to address this issue. A very similar concern was also raised by Reviewer #2. Below we try to clarify the motivations that led us to predict that the hypothesized long-term predictions should manifest at the onset (and not within or at the end) of a perceptual chunk.

      Reviewers #1 and #2 contest a critical assumption of our study, i.e., the fact that long-term predictions should occur at the beginning of a rhythmic chunk as opposed to its completion. They also contest the prediction deriving from this view, i.e., that omitting the first sound in a perceptual chunk (short for Spanish, long for Basque) would lead to larger error responses than omitting a later element. They suggest an alternative view: the omission of tones at the end of a perceptual rhythmic chunk would evoke larger error responses than omissions at its onset, as subjects are more likely to predict the completion of the chunk than its beginning. This view predicts an interaction effect in the opposite direction of our findings.

      While we acknowledge this as a plausible hypothesis, we believe that the current literature provides strong support for our view. Indeed, many studies in the rhythm and music perception literature have investigated the ERP responses to deviant sounds and omissions placed at different positions within rhythmic patterns (e.g., Ladinig et al., 2009; Bouwer et al., 2016; Brochard et al., 2003; Potter et al., 2009; Yabe et al., 2001). For instance, Ladinig et al. (2009) presented participants with metrical rhythmical sound sequences composed of eight tones. In some deviant sequences, the first or a later tone was omitted. They found that earlier omissions elicited earlier and higher-amplitude MMN responses than later omissions (irrespective of attention). Overall, this and other studies showed that the amplitude of ERP responses is larger when deviants occur at positions that are expected to be the “start” of a perceptual group - “on the beat” in musical terms - and declines toward the end of the chunk. According to some of these studies, the first element of a chunk is particularly important to track the boundaries of temporal sequences, which is why more predictive resources are invested at that position. We believe that this body of evidence provides robust bases for our hypotheses and the directionality of our predictions.

      An additional point that should be considered concerns the amplitude of the prediction error response elicited by the omission. From a predictive coding perspective, the omission of the onset of a chunk should elicit larger error responses because the system is expecting the whole chunk (i.e., two tones/more acoustic information). On the other hand, the omission of the second tone - in the transition between two tones within the chunk - should elicit a smaller error response because the system is expecting only the missing tone (i.e. less acoustic information). 

      Given the importance of these points, we have now included them in the updated version of the paper, in which we try to better clarify the rationale behind our hypothesis (see Introduction section, around the 10th paragraph).

      (2) The authors report an interaction effect that modulates the amplitude of the omission response, but caveats make the interpretation of this effect somewhat uncertain. The authors report a widespread omission response, which resembles the classical mismatch response (in MEG) with strong activations in sensors over temporal regions. Instead, the interaction found is circumscribed to four sensors that do not overlap with the peaks of activation of the omission response.

      We thank the Reviewer for this comment. As mentioned in the provisional response, the approach employed to identify the presence of an interaction effect was conservative: we used a non-parametric test on combined gradiometer data, without making a priori assumptions about the location of the effect, and employed small cluster thresholds (cfg.clusteralpha = 0.05) to increase the chances of detecting highly localized clusters with large effect sizes. The fact that the interaction effect arises in a relatively small cluster of sensors does not alter its statistical robustness. It should also be considered that in the present analyses we focused on planar gradiometer data, which, compared to magnetometers and axial gradiometers, offer finer spatial resolution and are better suited to picking up relatively small effects.

      The partial overlap of the cluster with the activation peaks may simply reflect the fact that different sources contribute to the generation of the omission-MMN, which has been reported in several studies (e.g., Zhang et al., 2018; Ross & Hamm, 2020).  We value the Reviewer’s input and are grateful for the opportunity to address these considerations.

      Furthermore, the boxplot in Figure 2E suggests that part of the interaction effect might be due to the presence of two outliers (if removed, the effect is no longer significant). Overall, it is possible that the reported interaction is driven by a main effect of omission type which the authors report, and find consistently only in the Basque group (showing a higher amplitude omission response for long tones than for short tones). Because of these points, it is difficult to interpret this interaction as a modulation of the omission response.

      We thank the Reviewer for the comment and appreciate the opportunity to address these concerns. We have re-evaluated the boxplot in Figure 2E and want to clarify that the two participants mentioned by Reviewer #1, despite being somewhat distant from the rest of the group, are not outliers according to the standard Tukey’s rule. As shown in the figure below, no participant fell outside the upper (Q3 + 1.5 × IQR) and lower (Q1 - 1.5 × IQR) whiskers of the boxplot.
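      For readers who want to see how such a check works, Tukey's rule can be sketched in a few lines of Python. This is only an illustration, not the authors' analysis code: the amplitude values are hypothetical, the helper names are invented for this example, and the "inclusive" quartile convention is an assumption (statistical packages differ in how they compute quartiles, which can shift the fences slightly).

```python
from statistics import quantiles


def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    # Assumption: 'inclusive' quartiles; other conventions give slightly different fences.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def tukey_outliers(values, k=1.5):
    """Return the values that fall outside the Tukey fences."""
    lower, upper = tukey_fences(values, k)
    return [v for v in values if v < lower or v > upper]


# Hypothetical amplitudes (arbitrary units), NOT the data from the study.
amplitudes = [1.0, 1.2, 1.3, 1.5, 1.6, 1.8, 2.0, 2.1, 2.9, 3.0]

print(tukey_fences(amplitudes))
print(tukey_outliers(amplitudes))
```

      In this made-up sample the two largest values sit apart from the rest yet remain inside the upper fence, which is the kind of situation described in the response: visually distant points are not necessarily outliers under the 1.5 × IQR criterion.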

      Moreover, we believe that the presence of a main effect of omission type does not impact the interpretation of the interaction, especially considering that these effects emerge over distinct clusters of channels (see Fig. 1 C; Supplementary Fig. 2 A). 

      Based on these considerations - and along with the evidence collected in the control study and the source reconstruction data reported in the new version of the manuscript - we find it unlikely that the interaction effect is driven by outliers or by a main effect of omission type. We appreciate the opportunity provided by the Reviewer to address these concerns, as we believe they strengthen the claim that the observed effect is driven by the hypothesized long-term linguistic priors rather than uncontrolled group differences.

      Author response image 1.

      It should also be noted that in the source analysis, the interaction only showed a trend in the left auditory cortex, but in its current version the manuscript does not report the statistics of such a trend.

      We appreciate the Reviewer’s suggestion to incorporate more comprehensive source analyses. In the new version of the paper, we perform new analyses on the source data using a new atlas with more fine-grained parcellations of the regions of interest (ROIs) (Brainnetome atlas; Fan et al., 2016) and focusing on peak activity to increase the sensitivity of the response in space and time. We therefore invite the Reviewer to read the updated part on source reconstruction included in the Results and Methods sections of the paper.

      Reviewer #1 (Recommendations For The Authors):

      While I have described my biggest concerns with respect to this work in the public review, here I list more specific points that I hope will help to improve the manuscript. Some of these are very minor, but I hope you will still find them constructive. 

      (1) I understand the difficulties implied in recruiting subjects from two different linguistic groups, but with 20 subjects per group and a between-groups design, the current study is somewhat underpowered. A post-hoc power analysis shows an achieved power of 46% for medium effect sizes (d = 0.5, and alpha = 0.05, one-sided test). A sensitivity analysis shows that the experiment only has 80% power for effect sizes of d = 0.8 and above. It would be important to acknowledge this limitation in the manuscript. 

      We thank the Reviewer for reporting these analyses. It must be noted that our effect of interest was based on Molnar et al.’s (2016) behavioral experiment, in which a sample size of 16 subjects per group was sufficient to detect the perceptual grouping effect. In Yoshida et al. (2010), the perceptual grouping effect emerged with two groups of 20 Japanese- and English-learning infants aged 7–8 months. Based on these previous findings, we believe that a sample size of 20 participants per group can be considered appropriate for the current MEG study. We clarified these aspects in the Participants section of the manuscript, in which we specified that previous behavioral studies detected the perceptual grouping effect with similar sample sizes. Moreover, to acknowledge the limitation highlighted by the Reviewer, we also include the power and sensitivity analysis in a note in the same section (see note 2 in the Participants section).
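      The power and sensitivity figures quoted in this exchange (20 participants per group, one-sided alpha = 0.05) can be approximated with a short Python sketch. It is an illustration under stated assumptions, not the calculation the Reviewer or the authors ran: it uses a normal approximation to the two-sample test rather than the noncentral t distribution, so it should land close to, but slightly above, exact t-based values, and the function names are invented for this example.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power of a one-sided, two-sample comparison.

    Normal approximation: power = Phi(d * sqrt(n/2) - z_(1-alpha)).
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    noncentrality = d * sqrt(n_per_group / 2)
    return STD_NORMAL.cdf(noncentrality - z_crit)


def min_detectable_d(n_per_group, power=0.80, alpha=0.05):
    """Smallest effect size d detectable with the requested power (same approximation)."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    z_power = STD_NORMAL.inv_cdf(power)
    return (z_alpha + z_power) / sqrt(n_per_group / 2)


print(f"power for d = 0.5: {approx_power(0.5, 20):.2f}")
print(f"power for d = 0.8: {approx_power(0.8, 20):.2f}")
print(f"d detectable with 80% power: {min_detectable_d(20):.2f}")
```

      An exact calculation would replace the normal distribution with the noncentral t distribution (available in dedicated power-analysis software), which lowers the estimated power slightly for small samples such as this one.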

      (2) All the line plots in the manuscript could be made much more informative by adding 95% CI bars. For example, in Figure 4A, the omission response for the long tone departs from the one for the short tone very early. Adding CIs would help to assess the magnitude of that early difference. Error bars are present in Figure 3, but it is not specified what these bars represent. 

      Thanks for the comments. We added the explanation of the error bars in the new version of Figure 3. For the remaining figures, we prefer maintaining the current version of the ERF, as the box-plots accompanying them provide information about the distribution of the effect across participants.

      (3) In the source analysis, there is only mention of an interaction trend in the left auditory cortex, but no statistics are presented. If the authors prefer to mention such a trend, I think it would be important to provide its stats to allow the reader to assess its relevance. 

      We performed new analyses on the source data, all reported in the updated version of the manuscript.

      (4) In the discussion section, the authors refer to the source analysis and state that "the interaction is evident in the left". But if only a statistical trend was observed, this statement would be misleading. 

      We agree with this comment. We invite the Reviewer to check the new part on source reconstruction, in which we perform contrasts that go in the same direction as the sensor-level data.

      (5) In the discussion the authors argue that "This result highlights the presence of two distinct systems for the generation of auditory" that operate at different temporal scales, but the current work doesn't offer evidence for the existence of two different systems. The effects of long-term priors and short-term priors presented here are not dissociated and instead sum up. It remains possible that a single system is in place, collecting statistics of stimuli over a lifetime, including the statistics experienced during the experiment. 

      Thanks for pointing that out. We changed the sentence above as follows: “This result highlights the presence of an active predictive system that relies on natural sound statistics learned over a lifetime to process incoming auditory input”.

      (6) In the discussion, the authors acknowledge that the omission response has been interpreted both as pure prediction and as pure prediction error. Then they declare that "Overall, these findings are consistent with the idea that omission responses reflect, at least in part, prediction error signals.". However an argument for this statement is not provided. 

      Thanks for pointing out this lack of argument. In the new version of the manuscript, we explained our rationale as follows: “Since sensory predictive signals primarily arise in the same regions as the actual input, the activation of a broader network of regions in omission responses compared to tones suggests that omission responses reflect, at least in part, prediction error signals”.

      (7) In the discussion the authors present an alternative explanation in which both groups might devote more resources to the processing of long events, because these are relevant content words. Following this, they argue that "Independently on the interpretation, the lack of a main effect of omission type in the control condition suggests that the long omission effect is driven by experience with the native language." However as there was no manipulation of duration in the control experiment, a lack of the main effect of omission type there does not rule out the alternative explanation that the authors put forward. 

      This is correct; thanks for noticing it. We removed the sentence above to avoid ambiguities.

      Minor points: 

      (8) The scale of the y-axis in Figure 2C might be wrong, as it goes from 9 to 11 and then to 12. If the scale is linear, the top value should be 13, or the bottom value should be 10. 

      Figure 2C has been modified accordingly, thanks for noticing the error.

      (9) There is a very long paragraph starting on page 7 and ending on page 8. Toward the end of the paragraph, the analysis of the control condition is presented. That could start a new paragraph.

      Thanks for the suggestion. We modified the manuscript as suggested.

      Reviewer #2 (Public Review):

      (1) Despite the evidence provided on neural responses, the main conclusion of the study reflects a known behavioral effect on rhythmic sequence perceptual organization driven by linguistic background (Molnar et al. 2016, particularly). Also, the authors themselves provide a good review of the literature that evidences the influence of long-term priors in neural responses related to predictive activity. Thus, in my opinion, the strength of the statements the authors make on the novelty of the findings may be a bit far-fetched in some instances.

      Thanks for the suggestion. A similar point was also advanced by Reviewer #1. In general, we believe our work speaks to the predictive nature of such experience-dependent effects and shows that these linguistic priors shape sensory processes at very early stages. This is discussed in the sixth and seventh paragraphs of the Discussion section. In the new version of the article, we modified some statements and tried to make them more coherent with the scope of the present work. For instance, we replaced “This result highlights the presence of two distinct systems for the generation of auditory predictive models, one relying on the transition probabilities governing the recent past, and another relying on natural sound statistics learned over a lifetime” with “This result highlights the presence of an active predictive system that relies on natural sound statistics learned over a lifetime to process incoming auditory input”.

      (2) Albeit the paradigm is well designed, I fail to see the grounding of the hypotheses laid by the authors as framed under the predictive coding perspective. The study assumes that responses to an omission at the beginning of a perceptual rhythmic pattern will be stronger than at the end. I feel this is unjustified. If anything, omission responses should be larger when the gap occurs at the end of the pattern, as that would be where stronger expectations are placed: if in my language a short sound occurs after a long one, and I perceptually group tone sequences of alternating tone duration accordingly, when I hear a short sound I will expect a long one following; but after a long one, I don't necessarily need to expect a short one, as something else might occur.

      A similar point was advanced by Reviewer #1. We tried to clarify the rationale behind our hypothesis. Please refer to the response provided to the first comment of Reviewer #1 above.

      (3) In this regard, it is my opinion that what is reflected in the data may be better accounted for (or at least, additionally) by a different neural response to an omission depending on the phase of an underlying attentional rhythm (in terms of Large and Jones rhythmic attention theory, for instance) and putative underlying entrained oscillatory neural activity (in terms of Lakatos' studies, for instance). Certainly, the fact that the aligned phase may differ depending on linguistic background is very interesting and would reflect the known behavioral effect.

      We thank the Reviewer for this comment. We explored in more detail the possibility that the aligned phase may differ depending on linguistic background, which is indeed a very interesting hypothesis. In the phase analyses reported below we focused on the instantaneous phase angle time locked to the onset of short and long tones presented in the experiment.

      In short, we extracted time intervals of two seconds centered on the onset of the tones for each participant (~200 trials per condition) and, using a wavelet transform (implemented in FieldTrip, ft_freqanalysis), we targeted the 0.92 Hz frequency that corresponds to the rhythm of presentation of our pairs of tones. We extracted the phase angle for each time point and, using the circular statistics toolbox implemented in MATLAB, we computed the Rayleigh z-scores across the whole sensor space for each tone (long and short) and group (Spanish (Spa) dominants and Basque (Eus) dominants). This method evaluates the instantaneous phase clustering at a specific time point, and thus the presence of a specific oscillatory pattern at the onset of the specific tone.
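      The Rayleigh z statistic mentioned above has a simple closed form, z = n·R², where R is the mean resultant length of the n phase angles. The following Python sketch illustrates that computation only; it is not the MATLAB toolbox code used by the authors, the function names are invented for this example, and the phase values are synthetic.

```python
import cmath
import math


def mean_resultant_length(phases):
    """Length R of the average unit vector for phase angles given in radians."""
    n = len(phases)
    mean_vector = sum(cmath.exp(1j * p) for p in phases) / n
    return abs(mean_vector)


def rayleigh_z(phases):
    """Rayleigh z = n * R**2; larger values indicate stronger phase clustering."""
    return len(phases) * mean_resultant_length(phases) ** 2


# Synthetic single-trial phases at tone onset (radians), NOT data from the study.
n_trials = 200
clustered = [0.3 + 0.1 * math.sin(k) for k in range(n_trials)]   # tightly aligned
uniform = [2 * math.pi * k / n_trials for k in range(n_trials)]  # evenly spread

print(rayleigh_z(clustered))  # close to n_trials: strong phase clustering
print(rayleigh_z(uniform))    # close to 0: no preferred phase
```

      With perfectly aligned phases z equals the number of trials, and with phases spread evenly around the circle it approaches zero, which is why the statistic serves as an index of phase clustering at a given time point.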

      Author response image 2.

      Here we observe that the phase clustering was stronger in the right sensors for both groups. The critical point is to evaluate the phase angle (in radians) for the two groups and the two tones and see if there are statistical differences. We focused first on the sensor with the highest clustering (right temporal MEG1323) and observed very similar phase angles for the two groups for both long and short tones (see image below). We then focused on the four left fronto-temporal sensor pairs that showed the significant interaction: here we observed one sensor (MEG0412) with different effects for the two groups (the group-by-tone interaction was significant, p=0.02): for short tones the “Watson (1961) approximation U2 test” showed a p-value of 0.11, while for long tones the p-value was 0.03 (after correction for multiple comparisons).

      Overall, the present findings suggest a tendency for the two groups to phase-align differently to long and short tones over left fronto-temporal sensors. However, the effect could be detected in only one gradiometer sensor and it was not statistically robust. The effect in the right hemisphere was statistically more robust, but it was not sensitive to group language dominance.

      Due to the inconclusive nature of these analyses regarding the role of language experience in shaping the phase alignment to rhythmic sound sequences, we prefer to keep these results in the public review rather than incorporating them in the article. Nonetheless, we believe that this decision does not undermine the main finding that the group differences in the MMN amplitude are driven by long-term predictions – especially in light of the many studies indicating the MMN as a putative index of prediction error (e.g., Bendixen et al., 2012; Heilbron and Chait, 2018). Moreover, as suggested in the preliminary reply, although evoked responses and oscillations are often considered distinct electrophysiological phenomena, current evidence suggests that these phenomena are interconnected (e.g., Studenova et al., 2023). In our view, the hypotheses that the MMN reflects differences in phase alignment and long-term prediction errors are not mutually exclusive.

      Author response image 3.

      (4) Source localization is performed on sensor-level significant data. The lack of source-level statistics weakens the conclusions that can be extracted. Furthermore, only the source reflecting the interaction pattern is taken into account in detail as supporting their hypotheses, overlooking other sources. Also, the right IFG source activity is not depicted, but looking at whole brain maps it seems even stronger than the left. To sum up, source localization data, as informative as it could be, does not strongly support the authors' claims in its current state.

      A similar comment was also advanced by Reviewer #1 (comment 2). We appreciate the suggestion to incorporate more comprehensive source analyses. In the new version of the paper, we perform new analyses on the source data using a new atlas with more fine-grained parcellations of the ROIs, and focusing on peak activity to increase the sensitivity of the response in space and time. We therefore invite the Reviewer to read the updated part on source reconstruction included in the Results and Methods sections of the paper.

      In the article, we report only the source reconstruction data from ROIs in the left hemisphere, because it is there that the interaction effect arises at the sensor level. However, we also explored the homologous regions in the right hemisphere, as requested by the Reviewer. A cluster-based permutation test focusing on the interaction between language group and omission type was performed on both the right STG and IFG data. No significant interaction emerged in either of these regions. Below is a plot of the source activity time series over ROIs in the right STG and IFG.

      Author response image 4.

      Reviewer #2 (Recommendations For The Authors):

      In this set of private recommendations for the authors, I will outline a couple of minor comments and try to encourage additional data analyses that, in my opinion, would strengthen the evidence provided by the study. 

      (1) As I noted in the public review, I believe an oscillatory analysis of the data would, on one hand, provide stronger support for the behavioral effect of rhythmic perceptual organization given the lack of behavioral direct evidence; and, on the other hand, provide evidence (to be discussed if so) for a role of entrained oscillation phase in explaining the different pattern of omission responses. One analysis the authors could try is to measure the phase angle of an oscillation, the frequency of which relates to the length of the binary pattern, at the onset of short and long tones, separately, and compare it across groups. Also, single trials of omission responses could be sorted according to that phase. 

      Thanks for the suggestion. Please see phase analyses reported above.

      (2) I wonder why source activity for the right IFG was not shown. I urge the authors to provide and discuss a more complete picture of the source activity found. Given the lack of source statistics (which could be performed), I find it a must to give an overall view. I find it so because I believe the distinction between perceptual grouping effects due to inherent acoustic differences across languages or semantic differences is so interesting. 

      Thanks again for the invitation to provide a more complete picture of the source activity data. As mentioned in the response above, we invite the Reviewer to read the new related part included in the Results and Methods sections of the paper. In our updated source reconstruction analysis, we find that some regions around the left STG show a pattern that resembles the one found at the sensor-level, providing further support for the “acoustic” (rather than syntactic/semantic) nature of the effect. 

      We did not report ROI analysis on the right hemisphere because the interaction effect at sensor level emerged on the left hemisphere. Yet, we included a summary of this analysis in the public response above. 

      (3) Related to this, I have to acknowledge I had to read the whole Molnar et al. (2016) study to find the only evidence so far that, acoustically, in terms of sound duration, Basque and Spanish differ. This was hypothesized before, but only in Molnar et al. is an acoustic analysis performed. I think this is key, and the authors should give it a deeper account in their manuscript. I spent my review of this study thinking, well, but when we speak we actually bind together different words and the syllabic structure does not need to reflect the written one, so maybe the effect is due to a high-level statistical prior related to the content of the words... but Molnar showed me that actually, acoustically, there's a difference in accent and duration: "Taken together, Experiments 1a and 1b show that Basque and Spanish exhibit the predicted differences in terms of the position of prosodic prominence in their phonological phrases (Basque: trochaic, Spanish: iambic), even though the acoustic realization of this prominence involves not only intensity in Basque but duration, as well. Spanish, as predicted, only uses duration as a cue to mark phrasal prosody."

      Thanks for the suggestion, the distinction in terms of sound duration in Spanish and Basque reported by Molnar is indeed very relevant for the current study. 

      We add a few sentences to highlight the acoustic analysis by Molnar and the consequent acoustic nature of the reported effect.

      In the introduction: “Specifically, the effect has been proposed to depend on the quasiperiodic alternation of short and long auditory events in the speech signal – reported in previous acoustic analyses (Molnar et al., 2016) – which reflect the linearization of function words (e.g., articles, prepositions) and content words (e.g., nouns, adjectives, verbs).”

      In the discussion, paragraph 3, we changed “We hypothesized that this effect is linked to a long-term “duration prior” originating from the syntactic function-content word order of language, and specifically, from its acoustic consequences on the prosodic structure” to “We hypothesized that this effect is linked to a long-term “duration prior” originating from the acoustic properties of the two languages, specifically from the alternation of short and long auditory events in their prosody”.

      In the discussion, end of paragraph eight: “The reconstruction of cortical sources associated with the omission of short and long tones in the two groups showed that an interaction effect mirroring the one at the sensor level was present in the left STG, but not in the left IFG (fig. 3, B, C, D). Pairwise comparisons within different ROIs of the left STG indicated that the interaction effect was stronger over primary (BA 41/42) rather than associative (BA 22) portions of the auditory cortex. Overall, these results suggest that the “duration prior” is linked to the acoustic properties of a given language rather than its syntactic configurations”.

      Now, some minor comments: 

      (1) Where did the experiments take place? Were they in accordance with the Declaration of Helsinki? Did participants give informed consent? 

      All the requested information has been added to the updated version of the manuscript. Thanks for pointing this out.

      (2) The fixed interval should be called inter-stimulus interval. 

      Thanks for pointing this out. We changed the wording as suggested.

      (3) The authors state that "Omission responses allow to examine the presence of putative error signals decoupled from bottom-up sensory input, offering a critical test for predictive coding (Walsh et al 2020, Heilbron and Chait, 2018).". However the way omission responses are computed in their study is by subtracting the activity from the previous tone. This necessarily means that in the omission activity analyzed, there's bottom-up sensory input activity. As performing another experiment with a control condition in which a sequence of randomly presented tones with different durations to compare directly the omission activity in both sequences (experimental and control) is possibly too demanding, I at least urge the authors to incorporate the fact that their omission responses do reflect also tone activity. And consider, for future experiments, the inclusion of further control conditions. 

      Thanks for the opportunity to clarify this aspect. Actually, we did not compute the omission MMN by subtracting the activity of the previous tone from the omission, but by subtracting the activity of randomly selected tones across the whole experiment. That is, we randomly selected around 120 long and short tones (i.e., about the same number as the omissions); we computed the ERFs for the long and short tones; and we subtracted these ERFs from the ERFs of the corresponding short and long omissions. We clarified these aspects in both the Materials and Methods (ERF analysis paragraph) and Results sections.

      Moreover, the subtraction strategy, which is the standard approach to calculating the MMN, allows us to handle possible neural carryover effects arising from the perception of the tone preceding the omission.

      The sentence "Omission responses allow to examine the presence of putative error signals decoupled from bottom-up sensory input, offering a critical test for predictive coding (Walsh et al 2020, Heilbron and Chait, 2018)." simply refers to the fact that the error responses resulting from an omission are purely endogenous, as omissions are just the absence of an expected input (i.e., silence). On the other hand, when a predicted sequence of tones is disrupted by an auditory deviant (e.g., a tone with a different pitch or duration than the expected one), the resulting error response is not purely endogenous, but partially includes the response to the acoustic properties of the deviant.
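
      To make the subtraction described above concrete, here is a minimal sketch on synthetic data. It is not the analysis code used in the study: the function name `omission_mmn`, the array shapes, and the simulated signals are assumptions made purely for illustration.

```python
import numpy as np

def omission_mmn(omission_epochs, tone_epochs, n_tones=120, seed=0):
    """Omission difference wave for one condition (hypothetical helper).

    Both inputs are arrays of shape (n_trials, n_times).
    A random subset of tone trials is averaged into a tone ERF, which is
    then subtracted from the ERF of the omission trials.
    """
    rng = np.random.default_rng(seed)
    n_pick = min(n_tones, len(tone_epochs))
    # Random selection across the whole experiment, not the preceding tone.
    picked = rng.choice(len(tone_epochs), size=n_pick, replace=False)
    erf_tone = tone_epochs[picked].mean(axis=0)
    erf_omission = omission_epochs.mean(axis=0)
    return erf_omission - erf_tone

# Synthetic demo: omissions carry an extra deflection in samples 20-29.
demo_rng = np.random.default_rng(1)
n_times = 50
tone_epochs = demo_rng.normal(0.0, 1.0, size=(600, n_times))
omission_epochs = demo_rng.normal(0.0, 1.0, size=(120, n_times))
omission_epochs[:, 20:30] += 2.0

mmn = omission_mmn(omission_epochs, tone_epochs)
```

      In this toy example the difference wave is close to zero outside the simulated window and clearly positive inside it; the same function would be called once per condition (short and long).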

      (4) When multiple clusters emerged from a comparison, only the most significant cluster was reported. Why? 

      We found more than one significant cluster only in the comparison between pure omissions vs tones (figure 2 A, B). The additional significant cluster from this comparison is associated with a P-value of 0.04, emerges slightly earlier in time, and goes in the same direction as the cluster reported in the paper, i.e., larger ERF responses for omissions vs tones. We added a note specifying the presence of this second cluster, along with a figure in the supplementary material (Supplementary Fig. 1 A, B).

      (5) Fig 2, if ERFs are baseline corrected -50 to 0ms, why do the plots show pre-stimulus amplitudes not centered at 0? 

      This is because we combined the latitudinal and longitudinal gradiometers on the ERF obtained after baseline correction, by computing the root mean square of the signals at each sensor position (see also https://www.fieldtriptoolbox.org/example/combineplanar_pipelineorder/). This information is reported in the methods part of the article.
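
      The following sketch illustrates why the combined signal cannot be centered at zero in the baseline window. It uses synthetic noise and hypothetical helper names (`baseline_correct`, `combine_planar`); the exact scaling used by the toolbox may differ from the root mean square assumed here, but the point does not depend on it.

```python
import numpy as np

def baseline_correct(signal, baseline_mask):
    """Subtract the mean of the baseline window from a 1-D time course."""
    return signal - signal[baseline_mask].mean()

def combine_planar(grad_a, grad_b):
    """Root mean square of two orthogonal planar gradiometer signals.

    The result is non-negative at every sample (scaling is an assumption).
    """
    return np.sqrt((grad_a ** 2 + grad_b ** 2) / 2.0)

# Synthetic time axis in ms: -50 ms to 349 ms, baseline window -50 to 0 ms.
times_ms = np.arange(-50, 350)
baseline_mask = (times_ms >= -50) & (times_ms < 0)

demo_rng = np.random.default_rng(0)
grad_a = baseline_correct(demo_rng.normal(0.0, 1.0, times_ms.size), baseline_mask)
grad_b = baseline_correct(demo_rng.normal(0.0, 1.0, times_ms.size), baseline_mask)

combined = combine_planar(grad_a, grad_b)

# Each gradiometer has a zero-mean baseline, but the combined signal does not.
baseline_mean_single = grad_a[baseline_mask].mean()
baseline_mean_combined = combined[baseline_mask].mean()
```

      Because squaring removes the sign, any residual noise in the baseline window adds up to a positive offset after combination, even though each gradiometer was baseline-corrected to zero beforehand.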

      (6) Fig 2, add units to color bars. 

      Sure.

      (7) Fig 2 F and G, put colorbar scale the same for all topographies. 

      Sure, thanks for pointing this out.

      (8) The interaction effect language (Spanish; Basque) X omission type (short; long) appears only in a small cluster of 4 sensors not located at the locations with larger amplitudes to omissions. Authors report it as left frontotemporal, but it seems to me frontocentral with a slight left lateralization.

      (1) The fact that the cluster reflecting the interaction effect does not overlap with the peaks of activity is not surprising in our view. Many sources contribute to the generation of the MMN. The goal of our work was to establish whether there is also evidence for a long-term system (among the many) contributing to this. That is why we performed a first analysis on the whole omission response network (likely including many sources and predictive/attentional systems), and then zoomed in and focused on our hypothesized interaction. We never claim that the main source underlying the omission MMN is the long-term predictive system. 

      (2) The exact location of those sensors is at the periphery of the left-hemisphere omission response, which mainly reflects activity from the left temporal regions. The sensor location of this cluster could be influenced by multiple factors, including (i) the direction of the source dipoles determining an effect; (ii) the combination of multiple sources contributing to the activity measured at a specific sensor location, whose unmixing could be solved only with a beamforming source approach. Based on all the evidence we collected, including the source analyses, we concluded that the major contributors to the sensor-level interaction emerge from both frontal and temporal regions.

      Reviewer #3 (Public Review):

      (1) The main weaknesses are the strength of the effects and generalisability. The sample size is also relatively small by today's standards, with N=20 in each group. Furthermore, the crucial effects are all mostly in the .01>P<.05 range, such as the crucial interaction P=.03. It would be nice to see it replicated in the future, with more participants and other languages. It would also have been nice to see behavioural data that could be correlated with neural data to better understand the real-world consequences of the effect.

      We appreciate the positive feedback from Reviewer #3. We agree that it would be nice to see this study replicated in the future with larger sample sizes and a behavioral counterpart. Below are a few comments concerning the weakness highlighted: 

      (i) Concerning the sample size: a similar point was raised by Reviewer #1. We report our reply as presented above: “Despite a sample size of 20 participants per group can be considered relatively small for detecting an effect in a between-group design, it must be noted that our effect of interest was based on Molnar et al.’s (2016) experiment, where a sample size of 16 subjects per group was sufficient to detect the perceptual grouping effect. In Yoshida et al., 2010, the perceptual grouping effect arose with two groups of 20 7–8-month-old Japanese and English-learning infants. Based on these findings, we believe that a sample size of 20 participants per group can be considered appropriate for the current study”. We clarified these aspects in the new version of the manuscript.

      (ii) We believe that the lack of behavioral data does not undermine the main findings of this study, given the careful selection of the participants and the well-known robustness of the perceptual grouping effect (e.g., Iversen 2008; Yoshida et al., 2010; Molnar et al. 2014; Molnar et al. 2016). As highlighted by Reviewer #2, having Spanish and Basque dominant “speakers as a sample equates that in Molnar et al. (2016), and thus overcomes the lack of direct behavioral evidence for a difference in rhythmic grouping across linguistic groups. Molnar et al. (2016)'s evidence on the behavioral effect is compelling, and the evidence on neural signatures provided by the present study aligns with it”.

      (iii) Regarding the fact that the “crucial effects are all mostly in the .01>P<.05 range”: we want to stress that the approach we used to detect the interaction effect was conservative, using a cluster-based permutation approach with no a priori assumptions about the location of the effect. The robustness of our approach has also been highlighted by Reviewer 2: “Data analyses. Sound, state-of-the-art methodology in the event-related field analyses at the sensor level.” In sum, despite some crucial effects being in the .01>P<.05 range, we believe that the statistical soundness of our analysis, combined with the lack of effect in the control condition, provides compelling evidence for our H1.
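
      For readers unfamiliar with permutation statistics, the sketch below shows the core idea on a single value per participant. It is a deliberately simplified illustration, not the cluster-based procedure used in the study: it omits the clustering over sensors and time points, and the function name, sample sizes, and simulated data are all assumptions.

```python
import numpy as np

def interaction_permutation_test(diff_group_a, diff_group_b, n_perm=5000, seed=0):
    """Two-sided permutation test on a group-by-condition interaction.

    Each input holds one value per participant: the within-participant
    difference between two conditions (e.g. long minus short omission).
    The interaction is the between-group difference of these differences;
    its null distribution is built by shuffling group labels.
    """
    rng = np.random.default_rng(seed)
    observed = diff_group_a.mean() - diff_group_b.mean()
    pooled = np.concatenate([diff_group_a, diff_group_b])
    n_a = len(diff_group_a)
    count = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(pooled)
        stat = shuffled[:n_a].mean() - shuffled[n_a:].mean()
        if abs(stat) >= abs(observed):
            count += 1
    # Add-one correction so the p-value is never exactly zero.
    p_value = (count + 1) / (n_perm + 1)
    return observed, p_value

# Synthetic demo: two groups of 20 with opposite condition differences.
demo_rng = np.random.default_rng(42)
group_a = demo_rng.normal(1.0, 1.0, size=20)
group_b = demo_rng.normal(-1.0, 1.0, size=20)
observed, p_value = interaction_permutation_test(group_a, group_b)

# Sanity check: identical groups give an observed effect of zero and p = 1.
null_observed, null_p = interaction_permutation_test(group_a, group_a.copy())
```

      A cluster-based variant applies the same label shuffling but takes, for every permutation, the largest summed statistic over contiguous sensors and time points, which is what controls for multiple comparisons without fixing the location of the effect in advance.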

      Reviewer #3 (Recommendations For The Authors):

      Figures - Recommend converting all diagrams and plots to vector images to ensure they remain clear when zoomed in the PDF format. 

      Sure, thanks. 

      Figure 1: To improve clarity, the representation of sound durations in panels C and D should be revisited. The use of quavers/eighth notes can be confusing for those familiar with musical notation, as they imply isochrony. If printed in black and white, colour distinctions may be lost, making it difficult to discern the different durations. A more universal representation, such as spectrograms, might be more effective. 

      Thanks for the suggestion. It’s true that the quavers/eighth notes might be confusing in that respect. However, we consider this notation a relatively standard way of depicting paradigms in auditory neuroscience; see, for instance, the two papers below. In the new version of the manuscript, we specified in the figure captions that the notes refer to individual tones, in order to avoid ambiguities.

      - Wacongne, C., Labyt, E., Van Wassenhove, V., Bekinschtein, T., Naccache, L., & Dehaene, S. (2011). Evidence for a hierarchy of predictions and prediction errors in human cortex. Proceedings of the National Academy of Sciences, 108(51), 20754-20759.

      - Dehaene, S., Meyniel, F., Wacongne, C., Wang, L., & Pallier, C. (2015). The neural representation of sequences: from transition probabilities to algebraic patterns and linguistic trees. Neuron, 88(1), 2-19.

      Figure 2 : In panel C of Figure 2, please include the exact p-value for the interaction observed. Refrain from using asterisks or "n.s." and opt for exact p-values throughout for the sake of clarity. 

      Thank you for your suggestion. We have included the exact p-value for the interaction in panel C of Figure 2. However, for the remaining figures, we have chosen to maintain the use of asterisks and "n.s.". We would like our pictures to convey the key findings concisely, while the numerical details can be found in the article text. The caption below the image also provides guidance on the interpretation of the p-values: (statistical significance: **p < 0.01, *p < 0.05, and ns p > 0.05).  

      Figure 3 Note typo "Omission reponse"

      Fixed. Thanks for noticing the typo. 

      A note: we moved the figure reflecting the main effect of long tone omission and the lack of main effect of language background (Figure 4 in the previous manuscript) to the supplementary material (Supplementary Figure 2).

      References

      Bendixen, A., SanMiguel, I., & Schröger, E. (2012). Early electrophysiological indicators for predictive processing in audition: a review. International Journal of Psychophysiology, 83(2), 120-131.

      Heilbron, M., & Chait, M. (2018). Great expectations: is there evidence for predictive coding in auditory cortex?. Neuroscience, 389, 54-73.

      Iversen, J. R., Patel, A. D., & Ohgushi, K. (2008). Perception of rhythmic grouping depends on auditory experience. The Journal of the Acoustical Society of America, 124(4), 2263-2271.

      Molnar, M., Lallier, M., & Carreiras, M. (2014). The amount of language exposure determines nonlinguistic tone grouping biases in infants from a bilingual environment. Language Learning, 64(s2), 45-64.

      Molnar, M., Carreiras, M., & Gervain, J. (2016). Language dominance shapes non-linguistic rhythmic grouping in bilinguals. Cognition, 152, 150-159.

      Ross, J. M., & Hamm, J. P. (2020). Cortical microcircuit mechanisms of mismatch negativity and its underlying subcomponents. Frontiers in Neural Circuits, 14, 13.

      Simon, J., Balla, V., & Winkler, I. (2019). Temporal boundary of auditory event formation: An electrophysiological marker. International Journal of Psychophysiology, 140, 53-61.

      Studenova, A. A., Forster, C., Engemann, D. A., Hensch, T., Sander, C., Mauche, N., ... & Nikulin, V. V. (2023). Event-related modulation of alpha rhythm explains the auditory P300 evoked response in EEG. bioRxiv, 2023-02.

      Yoshida, K. A., Iversen, J. R., Patel, A. D., Mazuka, R., Nito, H., Gervain, J., & Werker, J. F. (2010). The development of perceptual grouping biases in infancy: A Japanese-English cross-linguistic study. Cognition, 115(2), 356-361.

      Zhang, Y., Yan, F., Wang, L., Wang, Y., Wang, C., Wang, Q., & Huang, L. (2018). Cortical areas associated with mismatch negativity: A connectivity study using propofol anesthesia. Frontiers in Human Neuroscience, 12, 392.

      Ladinig, O., Honing, H., Háden, G., & Winkler, I. (2009). Probing attentive and preattentive emergent meter in adult listeners without extensive music training. Music Perception, 26(4), 377-386. 

      Brochard, R., Abecasis, D., Potter, D., Ragot, R., & Drake, C. (2003). The “ticktock” of our internal clock: Direct brain evidence of subjective accents in isochronous sequences. Psychological Science, 14(4), 362-366.

      Potter, D. D., Fenwick, M., Abecasis, D., & Brochard, R. (2009). Perceiving rhythm where none exists: Event-related potential (ERP) correlates of subjective accenting. Cortex, 45(1), 103-109.

      Bouwer, F. L., Werner, C. M., Knetemann, M., & Honing, H. (2016). Disentangling beat perception from sequential learning and examining the influence of attention and musical abilities on ERP responses to rhythm. Neuropsychologia, 85, 80-90.

    2. eLife assessment

      This study presents important observations about how the human brain uses long-term priors (acquired during our lifetime of listening) to make predictions about expected sounds - an open question in the field of predictive processing. The evidence presented is solid and based on state-of-the-art statistical analysis, but limited by a relatively low N and low magnitude for the interaction effect.

    3. Reviewer #1 (Public Review):

      Summary:

      In this work, the authors study whether the human brain uses long-term priors (acquired during our lifetime) regarding the statistics of auditory stimuli to make predictions about auditory stimuli. This is an important open question in the field of predictive processing.

      To address this question, the authors cleverly profit from the naturally existing differences in two linguistic groups. While speakers of Spanish use phrases in which function words (short words like articles and prepositions) are followed by content words (longer words like nouns, adjectives and verbs), speakers of Basque use phrases with the opposite order. Because of this, speakers of Spanish usually hear phrases in which short words are followed by longer words, and speakers of Basque experience the opposite. This difference in the order of short and longer words is hypothesized to result in a long-term duration prior that is used to make predictions regarding the likely durations of incoming sounds, even if they are not linguistic in nature.

      To test this, the authors used MEG to measure the mismatch responses (MMN) elicited by the omission of short and long tones that were presented in alternation. The authors report an interaction between the language background of the participants (Spanish, Basque) and the type of omission MMN (short, long), which is in line with their predictions. They supplement these results with a source-level analysis.

      Strengths:

      This work has many strengths. To test the main question, the authors profit from naturally occurring differences in the everyday auditory experiences of two linguistic groups, which makes it possible to test the effect of putative auditory priors consolidated over the years. This is a direct way of testing the effect of long-term priors.

      The fact that the priors in question are linguistic and that the experiment was conducted using non-linguistic stimuli (i.e. simple tones) makes it possible to test whether these long-term priors generalize across auditory domains.

      The experimental design is elegant and the analysis pipeline appropriate. This work is very well written. In particular the introduction and discussion sections are clear and engaging. The literature review is complete.

      Weaknesses:

      The authors report a widespread omission response, which resembles the classical mismatch response (in MEG planar gradiometers) with strong activations in sensors over temporal regions. However the interaction reported is circumscribed to four sensors that do not overlap with the peaks of activation of the omission response.

    4. Reviewer #2 (Public Review):

      Summary:

      Morucci et al. tested the influence of linguistic prosody long-term priors in forming predictions about simple acoustic rhythmic tone sequences composed of alternating tone duration, by violating context-dependent short-term priors formed during sequence listening. Spanish and Basque participants were selected due to the different rhythmic prosody of the two languages (functor-initial vs. functor-final, respectively), despite a common cultural background. The authors found that neuromagnetic responses to casual tone omissions reflected the linguistic prosody pattern of the participant's dominant language: in Spanish speakers, omission responses were larger to short tones, whereas in Basque speakers, omission responses were larger to long tones. Source localization of these responses revealed this interaction pattern in the left auditory cortex, which the authors interpret as reflecting a perceptual bias due to acoustic cues (inherent linguistic rhythms, rather than linguistic content). Importantly, this pattern was not found when the rhythmic sequence entailed pitch, rather than duration, cues. To my knowledge, this is the first study providing neural signatures of a known behavioral effect linking ambiguous rhythmic tone sequence perceptual organization to linguistic experience.

      The conclusions of the study are well supported by the data. The hypotheses, albeit allowing alternative perspectives, are well justified according to the existing literature. Albeit with inconclusive results, additional analyses to test entrained oscillatory activity to the perceived rhythms have been performed, which adds explanatory power to the study.

      Strengths:

      (1) The choice of participants. The bilingual population of the Basque country is perfect for performing studies which need to control for cultural and socio-economic background while having profound linguistic differences. In this sense, having dominant Basque speakers as a sample equates that in Molnar et al. (2016), and thus overcomes the lack of direct behavioral evidence for a difference in rhythmic grouping across linguistic groups. Molnar et al. (2016)'s evidence on the behavioral effect is compelling, and the evidence on neural signatures provided by the present study aligns with it.

      (2) The experimental paradigm. It is a well designed acoustic sequence, which considers aspects such as gap length insertion, to be able to analyze omission responses free from subsequent stimulus-driven responses, and which includes a control sequence which uses pitch instead of duration as a cue to rhythmic grouping, which provides a stronger case for the differences found between groups to be due to prosodic duration cues.

      (3) Data analyses. Sound, state-of-the-art methodology in the event-related field analyses at the sensor and source levels.

      Weaknesses:

      (1) The main conclusion of the study reflects a known behavioral effect on rhythmic sequence perceptual organization driven by linguistic background (Molnar et al. 2016, particularly) and, thus, the novelty of the findings is restricted to neural activity evidence.

      (2) Although the paradigm is well designed, there are alternative views in formulating the hypotheses. For instance, one could argue that, according to predictive coding views, omission responses should be larger when the gap occurs at the end of the pattern, as that would be where stronger expectations are placed. However, the authors provide good justification based on previous literature for the expectation of larger omission responses at the downbeat of a rhythmic pattern.

    5. Reviewer #3 (Public Review):

      Summary:

      The paper investigates the effects of long-term linguistic experience on early auditory processing, a subject that has been relatively less studied compared to short-term influences. Using MEG, the study examines brain responses to auditory stimuli in speakers of Spanish and Basque, whose syntactic rules provide different degrees of exposure to durational patterns (long-short vs short-long). The findings suggest that both long-term language experience as well as short-term transitional probabilities can shape auditory predictive coding for non-linguistic sound sequences, evidenced by differences in mismatch negativity amplitudes localised to left auditory cortex.

      Strengths:

      The study integrates linguistics and auditory neuroscience in an interesting interdisciplinary way that may interest linguists as well as neuroscientists. The fact that long-term language experience affects early auditory predictive coding is important for understanding group and individual differences in domain-general auditory perception. It has importance for neurocognitive models of auditory perception (e.g. inclusion of long-term priors), and will be of interest to researchers in linguistics, auditory neuroscience, and the relationship between language and perception. The inclusion of a control condition based on pitch is also a strength.

      Weaknesses:

      The main weaknesses are the strength of the effects and generalisability. Only two languages were examined, Spanish and Basque. The sample size is also relatively small by today's standards, with N=20 in each group. Furthermore, the crucial effects are all mostly in the .01>P<.05 range, such as the crucial interaction P=.03, although I note the methods used to derive the results are sound and state-of-the-art. It would be nice to see it replicated in the future, with more participants and other languages. It would also have been nice to see behavioural data that could be correlated with neural data to better understand the real-world consequences of the effect.

    1. Reviewer #3 (Public Review):

      Summary:

      This study aims to understand gene regulation of the plant bacterial pathogen Pseudomonas syringae. Although the function of some TFs has been characterized in this strain, a global picture of the gene regulatory network remains elusive. The authors conducted a large-scale ChIP-seq analysis, covering 170 out of 301 TFs of this strain, and revealed gene regulatory hierarchy with functional validation of some previously uncharacterized TFs.

      Strength:

      - This study provides one of the largest ChIP-seq datasets for a single bacterial strain, covering more than half of its TFs. This impressive resource enabled comprehensive systems-level analysis of the TF hierarchy.

      - This study identified novel gene regulation and function with validations through biochemical and genetic experiments.

      - The authors conducted broad analyses including comparisons between different bacterial strains, providing further insights into the diversity and conservation of gene regulatory mechanisms.

    2. Reviewer #2 (Public Review):

      Summary:

      The phytopathogenic bacterium Pseudomonas syringae is comprised of many pathovars with different host plant species and has been used as a model organism to study bacterial pathogenesis in plants. Transcriptional regulation is key to plant infection and adaptation to host environments by this bacterium. However, researchers have focused on a limited number of transcription factors (TFs) that regulate virulence-related pathways. Thus, a comprehensive, systems-level understanding of regulatory interactions between transcription factors in P. syringae has not been achieved.

      This study by Sun et al performed ChIP-seq analysis of 170 out of 301 TFs in P. syringae pv. syringae 1448A and used this unique dataset to infer transcriptional regulatory networks in this bacterium. The network analyses revealed hierarchical interactions between TFs, various network motifs, and co-regulation of target genes by TF pairs, which collectively mediate information flow. As discussed, the structure and properties of the P. syringae transcriptional regulatory networks are somewhat different from those identified in humans, yeast, and E. coli, highlighting the significance of this study. Further, the authors made use of the P. syringae transcriptional regulatory networks to find TFs of unknown functions to be involved in virulence-related pathways. For some of these TFs, their target specificity and biological functions, such as motility and biofilm formation, were experimentally validated. Of particular interest is the finding that despite conservation of TFs between P. syringae pv. syringae 1448A, P. syringae pv. tomato DC3000, P. syringae pv. syringae B728a, and P. syringae pv. actinidiae C48, some of the conserved TFs show different repertoires of target genes in these four P. syringae strains.

      Strengths:

      This study presents a systems-level analysis of transcriptional regulatory networks in relation to P. syringae virulence and metabolism, highlights differences in transcriptional regulatory landscapes of conserved TFs between different P. syringae strains, and develops a user-friendly database for mining the ChIP-seq data generated in this study. These findings and resources will be valuable to researchers in the fields of systems biology, bacteriology, and plant-microbe interactions.

      Weaknesses:

      No major weaknesses were found, but some of the results may need to be interpreted with caution. ChIP-seq was performed with bacterial strains overexpressing TFs. This may cause artificial binding of TFs to promoters which may not occur when TFs are expressed at physiological levels. Another caution is applied to the interpretation of the biological functions of TFs during plant infection, as biological roles of the tested TFs are mostly based on in vitro experiments.

      This work advances our understanding of transcriptional regulation of virulence and metabolic pathways in plant pathogenic bacteria. Solid evidence for the claims is provided by computational analysis of newly generated data on the genome-wide binding of 170 transcription factors to their target genes, together with experimental validation of the biological functions of some of these transcription factors. The findings and resources from this study will be valuable to researchers in the fields of systems biology, bacteriology, and plant-microbe interactions.

    3. eLife assessment

      This work advances our understanding of transcriptional regulation of virulence and metabolic pathways in plant pathogenic bacteria. Solid evidence for the claims is provided by computational analysis of newly generated data on the genome-wide binding of 170 transcription factors to their target genes, together with experimental validation of the biological functions of some of these transcription factors. The findings and resources from this study will be valuable to researchers in the fields of systems biology, bacteriology, and plant-microbe interactions.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this work, the authors provide a comprehensive description of transcriptional regulation in Pseudomonas syringae by investigating the binding characteristics of various transcription factors. They uncover the hierarchical network structure of the transcriptome by identifying top-, middle-, and bottom-level transcription factors that govern the flow of information in the network. Additionally, they assess the functional variability and conservation of transcription factors across different strains of P. syringae by studying DNA-binding characteristics. These findings notably expand our current knowledge of the P. syringae transcriptome.

      The findings associated with crosstalk between transcription factors and pathways, and the diversity of transcription factor functions across strains provide valuable insights into the transcriptional regulatory network of P. syringae. However, these results are at times underwhelming as their significance is unclear. This study would benefit from a discussion of the implications of transcription factor crosstalk on the functioning of the organism as a whole. Additionally, the implications of variability in transcription factor functions on the phenotype of the strains studied would further this analysis.

      Overall, this manuscript serves as a key resource for researchers studying the transcriptional regulatory network of P. syringae.

      Thank you for your positive comments.

      Reviewer #2 (Public Review):

      Summary:

      The phytopathogenic bacterium Pseudomonas syringae is comprised of many pathovars with different host plant species and has been used as a model organism to study bacterial pathogenesis in plants. Transcriptional regulation is key to plant infection and adaptation to host environments by this bacterium. However, researchers have focused on a limited number of transcription factors (TFs) that regulate virulence-related pathways. Thus, a comprehensive, systems-level understanding of regulatory interactions between transcription factors in P. syringae has not been achieved.

      This study by Sun et al performed ChIP-seq analysis of 170 out of 301 TFs in P. syringae pv. syringae 1448A and used this unique dataset to infer transcriptional regulatory networks in this bacterium. The network analyses revealed hierarchical interactions between TFs, various network motifs, and co-regulation of target genes by TF pairs, which collectively mediate information flow. As discussed, the structure and properties of the P. syringae transcriptional regulatory networks are somewhat different from those identified in humans, yeast, and E. coli, highlighting the significance of this study. Further, the authors made use of the P. syringae transcriptional regulatory networks to find TFs of unknown functions to be involved in virulence-related pathways. For some of these TFs, their target specificity and biological functions, such as motility and biofilm formation, were experimentally validated. Of particular interest is the finding that despite conservation of TFs between P. syringae pv. syringae 1448A, P. syringae pv. tomato DC3000, P. syringae pv. syringae B728a, and P. syringae pv. actinidiae C48, some of the conserved TFs show different repertoires of target genes in these four P. syringae strains.

      Thank you for your positive comments.

      Strengths:

      This study presents a systems-level analysis of transcriptional regulatory networks in relation to P. syringae virulence and metabolism, and highlights differences in transcriptional regulatory landscapes of conserved TFs between different P. syringae strains, and develops a user-friendly database for mining the ChIP-seq data generated in this study. These findings and resources will be valuable to researchers in the fields of systems biology, bacteriology, and plant-microbe interactions.

      Thank you for your positive comments.

      Weaknesses:

      No major weaknesses were found, but some of the results may need to be interpreted with caution. ChIP-seq was performed with bacterial strains overexpressing TFs. This may cause artificial binding of TFs to promoters which may not occur when TFs are expressed at physiological levels. Another caution is applied to the interpretation of the biological functions of TFs. The biological roles of the tested TFs are based on in vitro experiments. Thus, functional relevance of the tested TFs during plant infection and/or survival under natural environmental conditions remains to be demonstrated.

      Thank you for your comments, and we agree with the reviewer. To rule out artificial binding of TFs, we performed EMSAs to verify the analyzed targets. Our EMSA results confirmed the analyzed binding peaks.

      To verify the biological functions of TFs, we also performed in vivo motility and biofilm production assays (Figures 3b-d). To further examine the biological functions of TFs, we performed a plant infection assay for TF PSPPH2193 under natural environmental conditions (bean leaves). As shown in Figures S6c and g, both the motility and the virulence of the ∆PSPPH2193 strain were significantly reduced compared with the WT strain. These results showed that TF PSPPH2193 positively regulated the pathogenicity of P. syringae by modulating bacterial motility.

      Reviewer #3 (Public Review):

      Summary:

      This study aims to understand gene regulation of the plant bacterial pathogen Pseudomonas syringae. Although the function of some TFs has been characterized in this strain, a global picture of the gene regulatory network remains elusive. The authors conducted a large-scale ChIP-seq analysis, covering 170 out of 301 TFs of this strain, and revealed gene regulatory hierarchy with functional validation of some previously uncharacterized TFs.

      Thank you for your positive comments.

      Strengths:

      - This study provides one of the largest ChIP-seq datasets for a single bacterial strain, covering more than half of its TFs. This impressive resource enabled comprehensive systems-level analysis of the TF hierarchy.

      - This study identified novel gene regulation and function with validations through biochemical and genetic experiments.

      - The authors attempted broad analyses, including comparisons between different bacterial strains, providing further insights into the diversity and conservation of gene regulatory mechanisms.

      Thank you for your positive comments.

      Weaknesses:

      (1) Some conclusions are not backed by quantitative or statistical analyses, and they are sometimes overinterpreted.

      Thank you for your comments. We used a hypergeometric test in this analysis. Although only one gene was enriched in some pathways, the adjusted p-value was less than 0.05. We added the details in the revised manuscript.

      (2) Some figures and analyses are not well explained, and I was not able to understand them.

      Thank you for your comments, and we are sorry for the confusion. We defined ‘indirect interaction’ as ‘co-association’ and ‘cooperativity’ as ‘if the common target of two TFs is from a TF’. We added the definition of "indirect interaction" and "cooperativity" in the revised legend.

      For Figure S3a, the low co-association scores and large peak numbers of these top-level TFs indicated that top-level TFs preferred to solely regulate target genes, but not to co-regulate with other top-level TFs. PSPPH4700 was an example to show that top-level TFs with low co-association scores and large peak numbers tend to solely regulate target genes, but not to co-regulate with other top-level TFs. We revised the sentence to ‘For example, the top-level TF PSPPH4700 yielded over 1,700 peaks but cooperated with only 24 top-level TFs with low co-association scores about 0.05 (Supplementary Table 2b).’.

      We analyzed high co-association scores of 125 TFs in three levels and further determined the co-association patterns. To identify the tendency of co-association of all these 125 TFs, the co-association patterns were classified into 4 clusters. Bottom-level TFs tend to co-regulate target genes with other TFs. We revised the sentence in the revised manuscript.

      For Figure 2b, in C1, C2 and C4, many bottom-level TFs showed co-association patterns with other TFs, especially bottom-level TFs (shown in C4). To explore the regulatory pattern in C3, the peak locations in target genes of MexT were analyzed with those of TFs in C3. Seven top-level TFs (PSPPH1435, PSPPH1758, PSPPH2193, PSPPH2454, PSPPH4638, PSPPH4998 and PSPPH3411), three middle-level TFs (PSPPH1100, PSPPH5132 and PSPPH5144) and four bottom-level TFs (PSPPH0700, PSPPH2300, PSPPH2444 and PSPPH2580) were compared with MexT. MexT showed higher co-association scores (above 60) with more top-level TFs. Therefore, we demonstrated that MexT had closer co-association relationships with top-level TFs. We added the statement in the revised manuscript.

      For Figure 1a, the hierarchical network showed different numbers of TFs at the three levels (54 top-level TFs, 62 middle-level TFs and 147 bottom-level TFs), which indicated that more than half of the TFs (bottom-level TFs) tend to be regulated by other TFs and then directly bind to target genes. This finding showed a downward regulatory direction of transcription regulation in P. syringae. We revised the statement in the revised manuscript.

      (3) The Method section lacks depth, especially in data analyses. It is strongly recommended that the authors share their analysis codes so that others can reproduce the analyses.

      Thank you for your comments. We defined the intergenic region before each TF sequence as the promoter region. As the pHM1 plasmid carries its own constitutive promoter (lacZ promoter), we amplified the TF-coding sequence and cloned it into the site following the promoter. The TF protein expression was activated by the plasmid promoter. Psph 1448A was used for our main ChIP-seq. We added the details in the revised manuscript.

      For Figure S3, we performed GO analysis on genes that were co-bound by TF pairs. We added the details in the revised manuscript.

      We shared our analysis codes on the website (https://github.com/dengxinb2315/PS-PATRnet-code) in the Data Availability.

      Recommendations for the authors

      Reviewer #1 (Recommendations For The Authors):

      (1) The specific strain of Pseudomonas syringae used in the study outside of the evolutionary analysis should be specified in the abstract and main text.

      Thank you for your suggestion. We revised the statements in the abstract and main text to specify the strains.

      (2) The language used throughout the manuscript should be revised for clarity, conciseness, and readability.

      Thank you for your suggestion. The language throughout the manuscript has been revised by a scientific editor who is a native speaker of English.

      (2) Line 688: Replace "80C" with "-80C".

      Thank you for your correction. We revised ‘80℃’ to ‘-80℃’. Please see Line 713.

      (3) Line 172 - 173: The abbreviations TT, MM, BB, TM, TB, and MB need to be expanded in the main text before their use.

      Thank you for your suggestion. We added the abbreviations TT, MM, BB, TM, TB, and MB in the manuscript. Please see Lines 172-174.

      Reviewer #2 (Recommendations For The Authors):

      Major points

      (1) The name of the P. syringae strains used in each experiment/analysis should be explicitly stated (most experiments were carried out with P. syringae strain 1448A). This should also be applied to the introduction where many papers on P. syringae are cited without clear indication of strain names. I think this amendment is essential because target genes and thus biological functions of TFs could be different between P. syringae strains, as shown in the present study.

      Thank you for your suggestion. We revised the P. syringae strains in the citations throughout the manuscript.

      (2) How many TFs were analyzed throughout the study? Most sentences including line 22 in the abstract say 170, but I also found some say 270 (for example, line 106 and line 149). The legend of Figure 1 says 262. More detailed information is required regarding the datasets used for each analysis.

      Thank you for your suggestion. The number of TFs analyzed by ChIP-seq in this research is 170, and the number of TFs analyzed by HT-SELEX in our previous research is 100. The hierarchical analysis integrated data from ChIP-seq and HT-SELEX, which together included 270 TFs. As 8 TFs did not show hierarchical characteristics, the legend of Figure 1 said 262 TFs. We added the data source in the revised manuscript. Please see Lines 104, 147, 160 and 1082.
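
      The counts in this response can be checked with a few lines of arithmetic. The sketch below only restates the numbers given above; the variable names are illustrative, and it assumes that the ChIP-seq and HT-SELEX TF sets do not overlap, which the stated total of 270 implies.

```python
# Dataset sizes as stated in the author response (names are illustrative).
chip_seq_tfs = 170      # TFs profiled by ChIP-seq in this study
ht_selex_tfs = 100      # TFs profiled by HT-SELEX in the earlier study
non_hierarchical = 8    # TFs that showed no hierarchical characteristics

# Assumption: the two TF sets are disjoint, so the sizes simply add.
integrated = chip_seq_tfs + ht_selex_tfs
in_hierarchy = integrated - non_hierarchical

print(f"TFs integrated for hierarchical analysis: {integrated}")
print(f"TFs placed in the hierarchy (Figure 1 legend): {in_hierarchy}")
```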

      (3) Figure 1b: Please define "indirect interaction" and "cooperativity" in the legend as well as in the text. I only found the definition of "direct interaction".

      Sorry for the missing information. We defined ‘indirect interaction’ and ‘cooperativity’ as ‘co-association’ and ‘if the common target of two TFs is from a TF’, respectively. We added the definition of "indirect interaction" and "cooperativity" in the revised legend. Please see Lines 174-176, 1084-1086.

      (4) I found it very interesting that conserved TFs show different repertoires of target genes in different P. syringae strains. This suggests the rewiring of transcriptional regulatory networks in P. syringae strains, but the underlying mechanism is not explored in the current manuscript. It can be easily tested whether these conserved TFs bind to similar or different motifs by motif enrichment analysis. If they bind to similar motifs, it is possible that the promoter sequences of their target genes have diversified. Addressing or at least discussing these points would provide molecular insights into the diversification of the transcriptional regulatory networks in P. syringae. Similarly, functional enrichment analysis of target genes can be used to test whether the conserved TFs regulate different biological processes.

      Thank you for your suggestion. We added the motif analysis and functional enrichment analysis of target genes of TFs (PSPPH3122 and PSPPH4127) in different P. syringae strains. We found two different motifs (AGACN4GATCAA and CGGACGN3GATCA) in the 1448A and DC3000 strains, respectively. We also performed the GO analysis and found specific functions of PSPPH3122 in Psph 1448A compared with the Pst DC3000 and Pss B728a strains, including recombinase activity and DNA recombination. For PSPPH4127, we found four different motifs in the four P. syringae strains. GO analysis showed its relationship with recombinase activity in the Psph 1448A strain, and with RNA binding, structural constituent of ribosome, translation and ribosome in the Pss B728a strain. These results indicated the high functional diversity of TFs in P. syringae. We added these points in the Results part, and Figures S9-S10 in the revised manuscript. Please see Lines 497-509.
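
      For readers who want to scan a promoter sequence for motifs written in this notation, the following is a minimal sketch and not the authors' pipeline. It assumes that "N4" denotes a spacer of four arbitrary nucleotides, and it searches the forward strand only. The helper names `motif_to_regex` and `find_sites`, and the test sequence, are hypothetical.

```python
import re

def motif_to_regex(motif: str) -> str:
    """Expand spacers such as 'N4' into a regex class, e.g.
    'AGACN4GATCAA' -> 'AGAC[ACGT]{4}GATCAA'. A bare 'N' counts as one base."""
    return re.sub(
        r"N(\d*)",
        lambda m: "[ACGT]{%d}" % int(m.group(1) or 1),
        motif,
    )

def find_sites(motif: str, sequence: str):
    """Return (start, matched_site) for every forward-strand occurrence,
    including overlapping ones (hence the lookahead)."""
    pattern = re.compile("(?=(%s))" % motif_to_regex(motif))
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]

# Made-up sequence containing one site for the motif reported for 1448A.
sites = find_sites("AGACN4GATCAA", "TTAGACCCCCGATCAATT")
print(sites)
```

      Comparing the site lists obtained for orthologous promoters in different strains would be one simple way to ask whether target promoters, rather than the TFs themselves, have diverged.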

      (5) Related to point 4, it would be quite useful if a list of orthologous genes of 1448A TFs in the other tested P. syringae strains were provided. Such information may also enhance the utility of the database developed in this study.

      Thank you for your suggestion. We added the list of orthologous genes of 301 Psph 1448A TFs in the other tested P. syringae strains in Supplementary Table 5. Please see Line 467 and Supplementary Table 5.

      (6) Lines 243-246: It is unclear how these functional enrichment analyses were performed. Did you use target genes regulated by individual TFs or those coregulated by pairs of TFs? Please add more information for the sake of readers.

      Thank you for your suggestion. We performed the functional enrichment analyses using a hypergeometric test (BH-adjusted p < 0.05) on the target genes regulated by individual TFs. We added the details in the Results part. Please see Lines 248-252, 270, 1194-1195, 1199-1200 and 1205-1206.
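
      The test named in this response can be illustrated with a short, standard-library sketch. It is a generic implementation of a one-sided hypergeometric enrichment test with Benjamini-Hochberg adjustment, not the authors' code; the function names and all numbers in the example are hypothetical.

```python
from math import comb

def hypergeom_sf(k: int, N: int, K: int, n: int) -> float:
    """P(X >= k) when n target genes are drawn from a genome of N genes,
    K of which belong to the functional category being tested."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical case: 40 targets in a 5,000-gene genome, one of which falls
# in a category that contains only 3 genes.
p_single_hit = hypergeom_sf(k=1, N=5000, K=3, n=40)
print(p_single_hit)
print(bh_adjust([0.01, 0.04, 0.03]))
```

      With a very small category, a single overlapping gene can already give a small nominal p-value, which is why reporting the category size alongside the gene count helps readers judge such results.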

      Minor points

      (1) Lines 167-168: I may not understand correctly, but you might want to say "downward-pointing edges" instead of "upward-pointing edges".

      Thank you for the correction. We revised ‘upward-pointing edges’ to ‘downward-pointing edges’. Please see Line 166.

      (2) Line 174: "physical interactions" should be amended to "direct interactions".

      Thank you for the correction. We revised ‘physical interactions’ to ‘direct interactions’. Please see Line 177.

      (3) Line 224: Could you please explain why bacterial growth in plant tissues is considered an example of "multi-stability"?

      Thank you for your suggestion. We are sorry for the incorrect statement. We showed ‘plant intercellular spaces’ as ‘multi-stability’. We revised the sentence to ‘These auto-regulators are important and always act as repressors in scenarios of multi-stability, such as plant intercellular spaces’. Please see Lines 224-226.

      (4) Line 254-257: Here, the definition of "tether binding" is introduced, but it is not very clear to me. In my understanding, tethered binding is an indirect binding of a TF to a target gene through protein-protein interaction with other TF that directly binds to the promoter of the target gene.

      Thank you for your suggestion, and we agree with you. We referred to the paper published in 2012 (Wang et al., 2012) and revised the statement of ‘tether binding’ to ‘This finding suggested that these TFs indirectly regulated target genes through protein-protein interaction with other TFs that directly binds to the promoters of target genes, a phenomenon defined as tethered binding’. Please see Lines 259-262.

      (5) Lines 341-343: Figure 3b shows qRT-PCR of hopAE1, not hrpR.

      Thank you for your correction. We revised ‘hrpR’ to ‘hopAE1’. Please see Line 349.

      (6) Lines 500 and Figure 6b: It is hard to see edges from module 12 to others. So, it would be better to provide numeric information (number of TFs and target genes) in the text.

      Thank you for your suggestion. Module 12 includes 22 TFs and 318 target genes. We added the statement of numeric information about Module 12 in the revised manuscript. Please see Lines 536-537.

      (7) Line 519: Figure S4b is not the EMSA data for PSPPH3798. Should it be Figure S4e?

      Thank you for your correction. We revised to ‘Figure S4e’. Please see Line 545.

      (8) Line 522: Figure S6b is not relevant to the statement here.

      Thank you for your correction. We deleted the ‘Figure S6b’ here. Please see Line 547.

      (9) Line 593: prokaryotic transcriptional regulatory networks -> eukaryotic transcriptional regulatory networks?

      Thank you for your correction. We revised ‘prokaryotic transcriptional regulatory networks’ to ‘eukaryotic transcriptional regulatory networks’. Please see Line 618.

      (10) Figure S3 requires images of higher resolution. Especially, values for the color codes are not readable or very hard to see.

      Thank you for your suggestion. To make the images clearer, we enlarged them, changed the color codes, and divided the figure into three figures. Please see the revised Figures S3-S5 and the corresponding figure legends at Lines 1191-1206.

      Reviewer #3 (Recommendations For The Authors):

      (1) Some conclusions are not backed by quantitative or statistical analyses, and they are sometimes overinterpreted.

      L221: "Taken together, the simplest and most effective submodule M1 and the coregulatory submodule M13 played crucial roles in the transcriptional regulation of TFs in P. syringae."

      The authors did not provide any evidence supporting the functional importance of any of these submodules. M13 is most enriched within the locked loop, but its size is much smaller than simple loops. What evidence supports the importance of this particular submodule?

      Thank you for your suggestion. In the eukaryote Saccharomyces cerevisiae and the prokaryote Escherichia coli, which have the best-characterized transcriptional regulation networks, the feed-forward loop (called M13 in this article) appears numerous times in the networks and performs different biological functions. M1 appeared an order of magnitude more frequently than the other modules. We revised the sentence to ‘Taken together, the most numerous but simplest submodule M1 played a crucial role in the transcriptional regulation of TFs in P. syringae.’ Please see Lines 222-224.

      L223: "...we found 92 auto-regulators...These auto-regulators are important and always act as repressors in scenarios of multi-stability, such as in plant intercellular spaces where bacteria grow (Figure 1d)(Alon, 2007). These regulators are regarded as bistable switches that further influence the expression of downstream genes."

      Are these claims supported by any evidence?

      Thank you for your suggestion. We referred to the following articles:

      (1) Alon. Nature Reviews Genetics. 2007(Alon, 2007).

      Repression by transcription factors of the transcription of their target genes was considered negative regulation. These negative autoregulators account for half of the repressors in E. coli and occur in many eukaryotes. The repressors controlled the concentration of the target product by suppressing its expression, which accelerated the return of cells to the steady state.

      (2) Becskei. et al. Nature. 2000; Rosenfeld et al. Journal of Molecular Biology. 2002 (Becskei & Serrano, 2000; Rosenfeld, Elowitz, & Alon, 2002).

      Fluorescence assays confirmed that the negative autoregulatory module (negative autoregulator TetR) took less time to reach the log phase than the unregulated group, and that it reduced cell-to-cell fluctuations in the steady-state level of the transcription factor. Some negative autoregulators were shown here, such as LexA, CysB and SrlA-D.

      In our research, we also identified many autoregulators including CysB and LexA2 (annotated as LexA repressor). We revised the sentence to ‘In addition, we found 92 auto-regulators in our hierarchy network. These auto-regulators are important and always act as repressors in scenarios of multi-stability, such as plant intercellular spaces (Figure 1d) (Alon, 2007). For example, LexA and CysB as negative autoregulators were indicated to reduce cell-to-cell fluctuations in the steady-state level of the transcription factor (Becskei & Serrano, 2000; Rosenfeld et al. 2002).’. Please see Lines 224-229.

      L265: "This finding indicated that the bottom-level TFs, which were more easily regulated, tended to cooperate with downstream genes and other intra-level TFs."

      Could the authors provide more explanation to reach this conclusion from the data? Analyzing the number of highly co-accessing TFs does not sufficiently support this conclusion. The clustering of TFs (C1-C4) is incomplete, and each TF level (Top/Middle/Bottom) contains different numbers of TFs. Since the authors calculated all-by-all co-association scores for these 125 TFs, they can group these scores into 6 possible combinations (TT, TM, TB, MM, MB, BB) and show the distribution of co-association scores.

      Thank you for your suggestion. We indicated that the bottom-level TFs preferred to regulate the target genes through cooperation with other TFs. To further support the claim, we analyzed the proportion of bottom-level TF interactions among all TF pair interactions and among direct interactions, based on the results in Figure 1B. The interactions of bottom TFs accounted for 43% and 49%, respectively. However, the interactions of top TFs and middle TFs accounted for only 20% and 28%, respectively. We revised the statement to ‘Based on the analysis in Figure 1B, we found that the proportions of bottom-level TF interaction in all the TF pair interactions and direct interaction were 43% and 49%. These results indicated that the bottom-level TFs tended to regulate downstream genes through cooperating with other level TFs.’ in the revised manuscript. Please see Lines 269-272.

      As not every TF showed co-association with other TFs, we only collected the 125 TFs with co-association scores. For the numbers of TFs at each level, we divided TFs into three levels according to hierarchy height: heights from -1 to -0.3 represent the bottom level, heights from -0.3 to 0.3 represent the middle level, and heights from 0.3 to 1 represent the top level. Each level was equally divided by height scores. We suggested that the different numbers of TFs at the three levels indicated a characteristic of transcriptional regulation in P. syringae.

      Thank you for your suggestion. As the co-association patterns were determined by co-association scores of the same TFs, we first grouped the co-association scores into 3 possible TF pairs (TT, MM, and BB, in Figures S3a, S4a and S5a). Our results indicated that higher co-association scores tended to occur among bottom-level TFs. We revised the statement in the revised manuscript. Please see Lines 244-252.
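
      The level assignment described in this response can be written as a small function. This is a sketch under stated assumptions rather than the authors' implementation: the response does not say which level the boundary values -0.3 and 0.3 belong to, so placing both in the middle level is an assumption, and the function name `hierarchy_level` is hypothetical.

```python
def hierarchy_level(height: float) -> str:
    """Map a hierarchy height in [-1, 1] to 'bottom', 'middle' or 'top'.

    Thresholds follow the author response; assigning the boundary values
    -0.3 and 0.3 to the middle level is an assumption made here.
    """
    if not -1.0 <= height <= 1.0:
        raise ValueError("hierarchy height must lie between -1 and 1")
    if height < -0.3:
        return "bottom"
    if height <= 0.3:
        return "middle"
    return "top"

# Made-up heights for illustration only.
for h in (-0.8, 0.0, 0.65):
    print(h, hierarchy_level(h))
```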

      (2) Some figures and analyses are not well explained, and I was not able to understand them.

      Figure 1b: The terms "direct," "indirect," and "cooperativity" require further clarification as their definitions in the text (L169-183) are unclear to me. This ambiguity hampers the evaluation of the authors' discussion regarding TF-TF interactions (L561-584), an important theme of this study. The figure includes concepts discussed in later sections (e.g., cooperativity), making it difficult to understand. A diagram explaining these concepts would be highly helpful for readers to understand.

      Sorry for the missing information. We defined ‘indirect interaction’ as ‘co-association’, ‘cooperativity’ as ‘if the common target of two TFs is from a TF’. We added the definition of "indirect interaction" and "cooperativity" in the revised manuscript and legend. Please see Lines 174-176 and 1085-1087.

      L253: "Notably, we found that TFs at the top level, without cooperating TFs, exhibited a large number of binding peaks (Figure S3a)."

      I could not understand this sentence. Did the authors mean that top-level TFs with a large number of peaks showed a low level of co-association? If so, does this data suggest that these TFs do not tend to cooperate with other TFs? I was confused by the discussion in L253-L261.

      Thank you for your comment, and we agree with you. The low co-association scores and large peak numbers of these top-level TFs indicated that top-level TFs preferred to solely regulate target genes, but not to co-regulate with other top-level TFs.

      Thank you for your comment. From L253-256, PSPPH4700 was an example to show that top-level TFs with low co-association scores and large peak numbers tend to solely regulate target genes, but not to co-regulate with other top-level TFs. We revised the sentence to ‘For example, the top-level TF PSPPH4700 yielded over 1,700 peaks, but cooperated with only 24 top-level TFs with low co-association scores about 0.05 (Supplementary Table 2b).’.

      From L257-261, we analyzed high co-association scores of 125 TFs in three levels and further determined the co-association patterns. To identify the tendency of co-association of all these 125 TFs, the co-association patterns were classified into 4 clusters. Bottom-level TFs tend to co-regulate target genes with other TFs. We revised the sentence. Please see Lines 262-264, 265-266 and 269-272.

      L287: "The analysis of the peak locations of MexT demonstrated that MexT showed closer co-association relationships with top-level TFs (Figure 2b)."

      I could not reach this conclusion by seeing Figure 2b. Additional explanation and/or data visualization would be appreciated.

      Thank you for your suggestion. In C1, C2 and C4, many bottom-level TFs showed co-association patterns with other TFs, especially bottom-level TFs (shown in C4). To explore the regulatory pattern in C3, the peak locations in target genes of MexT were analyzed with those of TFs in C3. Seven top-level TFs (PSPPH1435, PSPPH1758, PSPPH2193, PSPPH2454, PSPPH4638, PSPPH4998 and PSPPH3411), three middle-level TFs (PSPPH1100, PSPPH5132 and PSPPH5144) and four bottom-level TFs (PSPPH0700, PSPPH2300, PSPPH2444 and PSPPH2580) were compared with MexT. MexT showed higher co-association scores (above 60) with more top-level TFs. Therefore, we demonstrated that MexT had closer co-association relationships with top-level TFs. We added the statement in the revised manuscript. Please see Lines 291-296.

      Figure 6cd: What kind of enrichment analysis did the authors perform? Was any statistical test used? The figure only shows the number of genes, and sometimes the number is only 1 for a functional category. Can it be considered as significant enrichment?

      Thank you for your comment. We used a hypergeometric test in this analysis. Although only one gene was enriched in some pathways, the adjusted p-value was less than 0.05. We added the details in the revised manuscript. Please see Lines 533-534.

      L169: "The hierarchical network revealed a downward information flow, suggesting the prioritization of collaboration between different hierarchy levels."

      Can the authors please explain the logic behind this statement more in detail?

      Thank you for your comment. The hierarchical network showed different numbers of TFs at the three levels (54 top-level TFs, 62 middle-level TFs and 147 bottom-level TFs), which indicated that more than half of the TFs (bottom-level TFs) tend to be regulated by other TFs and then directly bind to target genes. This finding showed a downward regulatory direction of transcription regulation in P. syringae. We revised the statement in the revised manuscript. Please see Lines 167-170.

      (3) The Method section lacks depth, especially on data analyses.

      How did the authors define promoter regions of each gene? How were operons treated in their analyses? Was P. syringae 1448A used for their main ChIP-seq?

      Thank you for your comment. We defined the intergenic region before each TF sequence as the promoter region.

      As the pHM1 plasmid carries its own constitutive promoter (lacZ promoter), we amplified the TF-coding sequence and cloned it into the site following the promoter. The TF protein expression was activated by the plasmid promoter.

      P. syringae 1448A was used for our main ChIP-seq. We added the details in the revised manuscript. Please see Lines 705 and 727-730.

      Figure S3: I am not sure how the GO analyses were done. For example, in the case of the top-level TF PSPPH4700, did the authors perform GO analysis on genes that are co-bound by PSPPH4700 and any other top-level TFs?

      Thank you for your comment and we agree with you. We performed GO analysis on genes that were co-bound by TF pairs in the same level. We added the details in the revised manuscript. Please see Lines 248-252.

      The analysis presented in Figure 6a needs more explanation of the methodology employed by the authors.

      Thank you for your comment. We added more details for the analysis in Figure 6a. Please see Lines 514-522.

      It is strongly recommended that the authors share their analysis codes so that others can reproduce the analyses.

      Thank you for your comment. We shared our analysis codes on the website (https://github.com/dengxinb2315/PS-PATRnet-code) in the Data Availability. Please see Lines 800-801.

      (4) Other:

      Figure 3: I suggest putting additional panel labels to facilitate the interpretation of the figure.

      Thank you for your suggestion. We added detailed labels in the revised Figures 3 and 4. Please see the revised Figures 3 and 4.

      I spotted several potential errors:

      L106: 170 TFs?

      Thank you for your comment, and we are sorry for the missing details. For the hierarchical network, we integrated the DNA-binding data of 170 TFs in this study and 100 TFs in our previous SELEX research. We added the details in the revised manuscript. Please see Lines 104, 147 and 159-160.

      L592: P. syringae not E. coli?

      Thank you for your comment. Here we discussed the hierarchical characteristics in E. coli. We revised the statement in the revised manuscript. Please see Line 618.

      L593: eukaryotic not prokaryotic?

      Thank you for your correction. Here we discussed the feedforward loops in our study. We revised the statement in the revised manuscript. Please see Line 618.

      References

      Alon, U. (2007). Network motifs: theory and experimental approaches. Nature Reviews Genetics, 8(6), 450-461.

      Becskei, A., & Serrano, L. (2000). Engineering stability in gene networks by autoregulation. Nature, 405(6786), 590-593.

      Rosenfeld, N., Elowitz, M. B., & Alon, U. (2002). Negative autoregulation speeds the response times of transcription networks. Journal of Molecular Biology, 323(5), 785-793.

      Wang, J., Zhuang, J., Iyer, S., Lin, X., Whitfield, T. W., Greven, M. C., . . . Cheng, Y. (2012). Sequence features and chromatin structure around the genomic regions bound by 119 human transcription factors. Genome Research, 22(9), 1798-1812.

    1. eLife assessment

      This useful study reports that the Drosophila transcription factor sisterless A (sisA) regulates the expression of Sex-lethal (Sxl) in female germ cells. The data supporting claims regarding the genetic requirement of sisA are convincing, but the characterization of the cis-regulatory elements controlling Sxl expression in the female germline is viewed as incomplete. The work will be of significant interest to colleagues studying reproductive biology and sex determination.

    2. Reviewer #1 (Public Review):

      Summary:

      In Drosophila melanogaster, expression of Sex-lethal (Sxl) protein determines sexual identity and drives female development. Functional Sxl protein is absent from males where splicing includes a termination codon-containing "poison" exon. Early during development, in the soma of female individuals, Sxl expression is initiated by an X chromosome counting mechanism that activates the Sxl establishment promoter (SxlPE) to produce an initial amount of Sxl protein. This then suppresses the inclusion of the "poison" exon, directing the constructive splicing of Sxl transcripts emerging from the Sxl maintenance promoter (SxlPM) which is activated at a later stage during development irrespective of sex. This autoregulatory loop maintains Sxl expression and commits to female development.

      Sxl also determines the sexual identity of the germline. Here Sxl expression generally follows the same principles as in somatic tissues, but the way expression is initiated differs from the soma. This regulation has so far remained elusive.

      In the presented manuscript, Goyal et al. show that activation of Sxl expression in the germline depends on additional regulatory DNA sequences, or sequences different from the ones driving initial Sxl expression in the soma. They further demonstrate that sisterless A (sisA), a transcription factor that is required for activation of Sxl expression in the soma, is also necessary, but not sufficient, to initiate the expression of functional Sxl protein in female germ cells. sisA expression precedes Sxl induction in the germline and its ablation by RNAi results in impaired expression of Sxl, formation of ovarian tumors, and germline loss, phenocopying the loss of Sxl. Intriguingly, this phenotype can be rescued by the forced expression of Sxl, demonstrating that the primary function of sisA in the germline is the induction of Sxl expression.

      Strengths:

      The clever design of probes (for RNA FISH) and reporters allowed the authors to dissect Sxl expression from different promoters to get novel insight into sex-specific gene regulation in the germline. All experiments are carefully controlled. Since Sxl regulation differs between the soma and the germline, somatic tissues provide elegant internal controls in many experiments, ensuring e.g. functionality of the reporters. Similarly, animals carrying newly generated alleles (e.g. genomic tagging of the Sxl locus) are fertile and viable, demonstrating that the genetic manipulation does not interfere with protein function. The conclusions drawn from the experimental data are sound and advance our understanding of how Sxl expression is induced in the female germline.

      Weaknesses:

      The assays employed by the authors provide valuable information on when Sxl promoters become active. However, since no information on the stability of the gene products (i.e. RNA and protein) is available, it remains unclear when the SxlPE promoter is switched off in the germline (conceptually it only needs to be active for a short time period to initiate production of functional Sxl protein). As correctly stated by the authors, the persisting signals observed in the germline might therefore not reflect the continuous activity of the SxlPE promoter.

      Mapping of regulatory elements and their function: SxlPE with 1.5 kb of flanking upstream sequence is sufficient to recapitulate early Sxl expression in the soma. The authors now provide evidence that beyond that, additional DNA sequences flanking the SxlPE promoter are required for germline expression. However, a more precise mapping was not performed. Also, due to technical limitations, the authors could not precisely map the sisA binding sites. Since this protein is also involved in the somatic induction of Sxl, its binding sites likely reside in the region 1.5kb upstream of the SxlPE promoter, which has been reported to be sufficient for somatic regulation. The regulatory role of the sequences beyond SxlPE-1.5kb therefore remains unaddressed and it remains to be investigated which trans-acting factor(s) exert(s) its/their function(s) via this region.

      The central question of how Sxl expression is initiated and controlled in the germline remains unanswered. Since sisA is zygotically expressed in both the male and the female germline (Figure 4D), it is unlikely to be the factor that restricts Sxl expression to the female germline.

      How does weak expression of Sxl in male tissues or expression above background after knockdown of sisA reconcile with the model that an autoregulatory feedback loop enforces constant and clonally inheritable Sxl expression once Sxl is induced? Is the current model for Sxl expression too simple or are we missing additional factors that modulate Sxl expression (such as e.g. Sister of Sex-lethal)? While I do not expect the authors to answer these questions, I would expect them to appropriately address these intriguing aspects in the discussion.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors wanted to determine whether cis-acting factors of Sxl - two different Sxl promoters in somatic cells - regulate Sxl in a similar way in germ cells. They also wanted to determine whether trans-acting factors known to regulate Sxl in the soma also regulate Sxl in the germline.

      Regarding the cis-acting factors, they examine the Sxl "establishment promoter" (SxlPE) that is activated in female somatic cells by the presence of two X chromosomes. Slightly later in development, dosage compensation equalizes X chromosome expression in males and females and so X chromosomes can no longer be counted. The second Sxl promoter is the "maintenance promoter" (SxlPM), which is activated in both sexes. The mRNA produced from the maintenance promoter has to be alternatively spliced by early Sxl protein generated earlier in development from the PE. This leads to an autoregulatory loop that maintains Sxl expression in female somatic cells. The authors used fluorescent in situ hybridization (FISH) with oligopaints to determine the temporal activation of the PE or PM promoters. They find that, unlike in the soma, the PE does not precede the PM and instead is activated contemporaneously with or later than the PM; this is confusing given the later results (see below). Next, they generated transcriptional reporter constructs containing large segments of the Sxl locus: the 1.5 kb used in somatic studies, a 5.2 kb reporter, and a 10.2 kb reporter. Interestingly, expression of the 1.5 kb reporter, which was reported to recapitulate Sxl expression in soma and germline, was not observed by the authors. Expression of the 5.2 kb reporter was observed in female somatic cells but not in germ cells. Only when they included an additional 5 kb downstream of the 5.2 kb reporter (here the 10.2 kb reporter) did they see expression in germ cells, but this occurred at the L1 stage. Their data indicate that Sxl activity in the germline requires different cis-regulation than in the soma and that the PE is activated later in germ cells than in somatic cells. The authors next use gene editing to insert epitope tags in two distinct strains in the hopes of creating an early Sxl and a later Sxl protein derived from the PE and PM, respectively. 
The HA-tagged protein from the PE was seen in somatic cells but never in the germline, possibly due to very low expression. The FLAG-tagged late Sxl protein is observed in L2 germ cells. Because the early HA-Sxl protein is not detectable in germ cells, it is not possible to draw conclusions about its role in the germline. However, because late FLAG-Sxl was only observed in L2 germ cells and the PE was detected in L1, this leaves open the possibility that PE produces early HA-Sxl (which currently cannot be detected), which then alternatively splices the transcript from the PM. In other words, the soma and germline could have a similar temporal relationship between the two Sxl promoters. While I agree with the authors about this conclusion, the earlier work with the oligopaints leads to the conclusion that PE is active after PM. This is confusing.

      Next, the authors turned their attention to the trans-acting factors that regulate Sxl in the soma, including Sisterless A (SisA), SisB, Runt, and the JAK/STAT ligand Unpaired. Using germline RNAi, the authors found that only knockdown of SisA causes ovarian tumors, similar to the loss of Sxl, suggesting that SisA regulates Sxl (i.e. the PE) in both the soma and the germline. They generated a SisA null allele using CRISPR/Cas9 and these animals had ovarian tumors and germ cell-less ovaries. FISH revealed that sisA is activated in primordial germ cells (PGCs) in stages 3-6, before the activation of Sxl. They used CRISPR/Cas9 to generate an endogenously tagged SisA and found that tagged SisA was expressed in stage 3-6 PGCs, which is consistent with activating PE in the germline. They showed that sisA is upstream of Sxl, as germline depletion of sisA led to a significant decrease in expression from the 10.2 kb PE reporter and in Sxl protein. The authors could rescue the ovarian tumors and loss of Sxl protein upon germline depletion of sisA by supplying Sxl from another promoter (the otu promoter). These data indicate that sisA is necessary for Sxl activation in the germline. However, ectopic sisA in germ cells in the testis did not lead to ectopic Sxl, suggesting that sisA is not sufficient to activate Sxl in the germline.

      Strengths:

      (1) The genetic and genomic approaches in this study are top-notch and they have generated reagents that will be very useful for the field.

      (2) Excellent use of powerful approaches (oligo paint, reporter constructs, CRISPR-Cas9 alleles).

      (3) The combination of state-of-the-art approaches and quantification of phenotypes allows the authors to draw important conclusions.

      Weaknesses:

      (1) Confusion in line 127 (this indicates that SxlPE is not activated before SxlPM in the germline) about PE not being activated before the PM in the germline when later figures show that PE is activated in L1 and late Sxl protein is seen in L2. It would be helpful to the readers if the authors edited the text to avoid this confusion. Perhaps more explanation of the results at specific points would be helpful.

    4. Reviewer #3 (Public Review):

      Summary:

      The mechanisms governing the initial female-specific activation of Sex-lethal (Sxl) in the soma, the subsequent maintenance of female-specific expression and the various functions of Sxl in somatic sex determination and dosage compensation are well documented. While Sxl is also expressed in the female germline where it plays a critical role during oogenesis, the pathway that is responsible for turning Sxl on in germ cells has been a long-standing mystery. This manuscript from Goyal et al describes studies aimed at elucidating the mechanism(s) for the sex-specific activation of the Sex-lethal (Sxl) gene in the female germline of Drosophila.

      In the soma, the Sxl establishment promoter, Sxl-Pe, is regulated in pre-cellular blastoderm embryos by several X-linked transcription factors (sis-a, sis-b, sis-c and runt). At this stage of development, the expression of these transcription factors is proportional to gene dose: 2x in females and 1x in males. The cumulative two-fold difference in the expression of these transcription factors is sufficient to turn Sxl-Pe on in female embryos. Transcripts from the Sxl-Pe promoter encode an "early" version of the female Sxl protein, and they function to activate a positive splicing autoregulatory loop by promoting the female-specific splicing of the initial pre-mRNAs derived from the Sxl maintenance promoter, Sxl-Pm (which is located upstream of Sxl-Pe). These female Sxl-Pm mRNAs encode a Sxl protein with a different N-terminus from the Sxl-Pe mRNAs, and they function to maintain female-specific splicing in the soma during the remainder of development.

      In this manuscript, the authors are trying to understand how the Sxl-Pm positive autoregulatory loop is established in germ cells. If Sxl-Pe is used and its activation precedes Sxl-Pm as is true in the soma, they should be able to detect Sxl-Pe transcripts in germ cells before Sxl-Pm transcripts appear. To test this possibility, they generated RNA FISH probes complementary to the Sxl-Pe first exon (which is part of an intron sequence in the Sxl-Pm transcript) and to a "common sequence" that labels both Sxl-Pe and Sxl-Pm transcripts. Transcripts labeled by both probes were detected in germ cells beginning at stage 5 (and reaching a peak at stage 10), so either the Sxl-Pm and Sxl-Pe promoters turn on simultaneously, or Sxl-Pe is not active.

      They next switched to Sxl-Pe reporters. The first Sxl-Pe:gfp reporter they used has a 1.5 kb upstream region, which in other studies was found to be sufficient to drive sex-specific expression in the soma of blastoderm embryos. Also, like the endogenous Sxl gene, it is not expressed in germ cells at this early stage. In 2011, Hashiyama et al reported that this 1.5 kb promoter fragment was able to drive gfp expression in Vasa-positive germ cells later in development, in stage 9/10 embryos. However, because of the high background of gfp in the nearby soma, their result wasn't especially convincing. Though they don't show the data, Goyal et al indicated that, unlike Hashiyama et al, they were unable to detect gfp expressed from this reporter in germ cells. Goyal et al extended the upstream sequences in the reporter to 5 kb, but they were still unable to detect germline expression of gfp.

      Goyal et al then generated a more complicated reporter which extends 5 kb upstream of the Sxl-Pe start site and 5 kb downstream, ending at or near the 4th exon of the Sxl-Pm transcript (the Sxl-Pe10 kb reporter). (The authors were not explicit as to whether the 5 kb downstream sequence extended beyond the 4th exon splice junction, in which case splicing could potentially occur with an upstream exon(s), or terminated prior to the splice junction, as seems to be indicated in their diagram.) With this reporter, they were able to detect sex-specific gfp expression in the germline beginning in L1 (first instar larva). With the caveat that gfp detection might be delayed compared to the onset of reporter activation, these findings indicated that the sequences in the reporter are able to drive sex-specific transcription in the germline at least as early as L1.

      The authors next tagged the N-terminal end of the Sxl-Pe protein with HA (using CRISPR/Cas9) and the N-terminal end of the Sxl-Pm protein with Flag. They report that the HA-Sxl-Pe protein is first detected in the soma at stage 9 of embryogenesis. Somatic HA-Sxl-Pe protein persists into L1, but is no longer detected in L2. However, while somatic HA-Sxl-Pe protein is detected, they were unable to detect HA-Sxl-Pe protein in germ cells. Flag-Sxl-Pm could first be detected in L2 germ cells, indicating that at this juncture the Sxl positive autoregulatory loop has been activated. This contrasts with Sxl-Pm transcripts, which are observed in a few germ cells at stage 5 of embryogenesis, and in most germ cells by stage 10. The authors propose (based on the expression pattern of the Sxl-Pe10kb reporter and the appearance of Flag-Sxl-Pm protein) that Sxl-Pe comes on in germ cells in L1, and that the Sxl-Pe protein activates the female splicing of Sxl-Pm transcripts, giving detectable Flag-Sxl-Pm proteins beginning in L2.

      To investigate the signals that activate Sxl-Pe in germ cells, the authors tested four of the X-linked genes (sis-a, sis-b, sis-c, and runt) that function to activate Sxl-Pe in the soma in early embryos. RNAi knockdown of sis-b, sis-c, and runt had no apparent effect on oogenesis. In contrast, knockdown of sis-a resulted in tumorous ovaries, a phenotype associated with Sxl mutations. (Three different RNAi transgenes were tested; two gave this phenotype, the third did not.) Sxl-Pe10kb reporter activity in L1 female germ cells is also dependent on sis-a.

      Several approaches were used to confirm a role for sis-a in a) oogenesis and b) the activation of the Sxl-Pm autoregulatory loop. They showed that sis-a germline clones (generated using tissue-specific CRISPR/Cas9 editing) resulted in the tumorous ovary phenotype and reduced the expression of Sxl protein in these ovaries. They found that sis-a transcripts and GFP-tagged Sis-A protein are present in germ cells. Finally, they showed that the tumorous ovary phenotype induced by germline RNAi knockdown of sis-a can be partially rescued by expressing Sxl in the germ cells.

      Critique:

      While this manuscript addresses a longstanding puzzle (the mechanism activating the Sxl autoregulatory loop in female germ cells) and has likely identified an important germline transcriptional activator of Sxl, sis-a, the data that the authors have generated do not make a compelling story. At every step, there are puzzle pieces that don't fit the narrative. In addition, some of their findings are inconsistent with many previous studies.

      (1) The authors used RNA FISH to time the expression of Sxl-Pe and Sxl-Pm transcripts in germ cells. Transcripts complementary to Sxl-Pe and Sxl-Pm were detected at the same time in embryos beginning at stage 5. This is not a definitive experiment as it could mean a) that Sxl-Pe and Sxl-Pm turn on at the same time, b) that Sxl-Pe comes on after Sxl-Pm (as suggested by the Sxl-Pe10kb reporter) or c) Sxl-Pe never comes on.

      (2) Hashiyama et al reported that they detected gfp expression in stage 9/10 germ cells from a 1.5 kb Sxl-Pe-gfp. As noted above, this result wasn't entirely convincing and thus it isn't surprising that Goyal et al were unable to reproduce it. Extending the upstream sequences to just before the 1st exon of Sxl-Pm transcripts also didn't give gfp expression in germ cells. Only when they added 5 kb downstream did they detect gfp expression. However, from this result, it isn't possible to conclude that the Sxl-Pe promoter is actually driving gfp expression in L1 germ cells. Instead, the Sxl promoter active in the germ line could be anywhere in their 10 kb reporter.

      (3) At least one experiment suggests that Sxl-Pe never comes on in germ cells. The authors tagged the N-terminus of the Sxl-Pe protein with HA and the N-terminus of the Sxl-Pm protein with Flag. Though they could detect HA-Sxl-Pe protein in the soma, they didn't detect it in germ cells. On the other hand, the Flag-Sxl-Pm protein was detected in L2 germ cells (but not earlier). These results would more or less fit with those obtained for the 10 kb reporter and would support the following model: Prior to L1, Sxl-Pm transcripts are expressed and spliced in the male pattern in both male and female germ cells. During L1, Sxl protein expressed via a mechanism that depends upon a 10 kb region spanning Sxl-Pe (but not on Sxl-Pe) is produced, and by L2 there are sufficient amounts of this protein to switch the splicing of Sxl-Pm transcripts from a male to a female pattern, generating Flag-tagged Sxl-Pm protein.

      (4) The 10 kb reporter is sex-specific, but not germline-specific. The levels of gfp in female L1 somatic cells are equal to, if not greater than, those in L1 female germ cells. That the Sxl-Pe10kb reporter is active in the soma complicates the conclusion that it represents a germline-specific promoter. Germline activity is, however, sensitive to sis-A knockdowns, which is a plus. Presumably, somatic expression of the reporter wouldn't be sensitive to a (late) sis-A knockdown, but this wasn't shown.

      (5) Their results with the HA-Sxl-Pe protein don't fit with many previous studies, assuming that the authors have explained their results properly. They report that HA-Sxl-Pe protein is first detected in the soma at stage 9 of embryogenesis and that it then persists till L2. However, previous studies have shown that Sxl-Pe transcripts and then Sxl-Pe proteins are first detected in ~NC11-NC12 embryos. In RNase protection experiments, the Sxl-Pe exon is observed in 2-4 hr embryos, but not detected in 5-8 hr, 14-12 hr, L1, L2, L3, or pupae. Northerns give pretty much the same picture. Western blots also show that Sxl-Pe proteins are first detectable around the blastoderm stage. So it is not at all clear why HA-Sxl-Pe proteins are first observed at stage 9, which, of course, is well after the time that the Sxl-Pm autoregulatory loop is established.

      Given the obvious problems with the initial timing of somatic expression described here, it is hard to know what to make of the fact that HA-tagged Sxl-Pe proteins aren't observed in germ cells.

      As for the presence of HA-Sxl-Pe proteins later than expected: While RNase protection/Northern experiments showed that Sxl-Pe mRNAs are expressed in 2-4 hr embryos and disappear thereafter, one could argue from the published Western experiments that the Sxl-Pe proteins expressed at the blastoderm stage persist at least until the end of embryogenesis, though perhaps at somewhat lower levels than at earlier points in development. So the fact that Goyal et al were able to detect HA-Sxl-Pe proteins in stage 9 embryos and later on in L1 larvae probably isn't completely unexpected. What is unexpected is that the HA-Sxl-Pe proteins weren't present earlier.

      (6) The authors use RNAi and germline clones to demonstrate that sis-A is required for proper oogenesis: when sis-A activity is compromised in germ cells, i) tumorous ovary phenotypes are observed and ii) there is a reduction in the expression of Sxl-Pm protein. They are also able to rescue the phenotypic effects of sis-a knockdown by expressing a Sxl-Pm protein. While the experiments indicating sis-a is important for normal oogenesis and that at least one of its functions is to ensure that sufficient Sxl is present in the germline stem cells seem convincing, other findings would make the reader wonder whether Sis-A is actually functioning (directly) to activate Sxl transcription from promoter X.

      The authors show that sis-a mRNAs and proteins are expressed in stage 3-5 germ cells (PGCs). This is not unexpected, as the X-linked transcription factors that turn Sxl-Pe on are expressed prior to nuclear migration, so their protein products should be present in early PGCs. The available evidence suggests that their transcription is shut down in PGCs by the factors responsible for transcriptional quiescence (e.g., nos and pgc), in which case transcripts might be detected in only one or two PGCs, which fits with their images. However, it is hard to believe that expression of Sis-A protein in pre-blastoderm embryos is relevant to the observed activation of the Sxl-Pm autoregulatory loop hours later in L2 larvae.

      It is also not clear how the very low level of gfp-Sis-A seen in only a small subset of migrating germ cells in stage 10 embryos (Figure S6) would be responsible for activating the Sxl-Pe10kb reporter in L1. It seems likely that the small amount of protein seen in stage 10 embryos is left over from the pre-cellular blastoderm stage. In this case, it would not be surprising to discover that the residual protein is present in both female and male stage 10 germ cells. This would raise further doubts about the relevance of the gfp-Sis-A at these early stages.

      In fact, given the evidence presented implicating sis-a in activating Sxl (the germline activation of the Sxl-Pe10kb reporter, the RNAi knockdowns, and the germ cell-specific sis-a clones), it is clear that the sis-A RNAs and proteins seen in pre-cellular blastoderm PGCs aren't relevant. The germline clone experiment (and also the RNAi knockdowns) indicates that sis-A must be transcribed in germ cells after Cas9 editing has taken place. Presumably, this would be after transcription is reactivated in the germline (~stage 10) and after the formation of the embryonic gonad (stage 14), so that the somatic gonadal cells can signal to the germ cells. With respect to the reporter, the relevant time frame for showing that sis-A is present in germ cells would be even later, in L1.

      (7) As noted above, the data in this manuscript do not support the idea that Sxl-Pe proteins activate the Sxl-Pm female splicing in the germline. FlyBase indicates that there is at least one other Sxl promoter that could potentially generate a transcript that includes the male exon but still could encode a Sxl protein. This promoter, "Sxl-Px", is located downstream of Sxl-Pm, and from its position it could have been included in the authors' 10 kb reporter. The reported splicing pattern of the endogenous transcript skips exon 2, and instead links an exon just downstream of Sxl-Px to the male exon. The male exon is then spliced to exon 4. If translation doesn't start and end at one of the small upstream ORFs in the exons close to Sxl-Px and the male exon, translation could begin with an AUG codon in exon 4 that is in frame with the Sxl protein coding sequence. This would produce a Sxl protein that lacks amino acid sequences from the N-terminus, but still retains some function.

      Another possible explanation for how gfp is expressed from the 10 kb reporter is that the transcript includes the "z" exon described by Cline et al., 2010.

    5. Author response:

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      In Drosophila melanogaster, expression of Sex-lethal (Sxl) protein determines sexual identity and drives female development. Functional Sxl protein is absent from males, where splicing includes a termination codon-containing "poison" exon. Early during development, in the soma of female individuals, Sxl expression is initiated by an X chromosome counting mechanism that activates the Sxl establishment promoter (SxlPE) to produce an initial amount of Sxl protein. This then suppresses the inclusion of the "poison" exon, directing the constructive splicing of Sxl transcripts emerging from the Sxl maintenance promoter (SxlPM), which is activated at a later stage during development irrespective of sex. This autoregulatory loop maintains Sxl expression and commits to female development. 

      Sxl also determines the sexual identity of the germline. Here Sxl expression generally follows the same principles as in somatic tissues, but the way expression is initiated differs from the soma. This regulation has so far remained elusive. 

      In the presented manuscript, Goyal et al. show that activation of Sxl expression in the germline depends on additional regulatory DNA sequences, or sequences different from the ones driving initial Sxl expression in the soma. They further demonstrate that sisterless A (sisA), a transcription factor that is required for activation of Sxl expression in the soma, is also necessary, but not sufficient, to initiate the expression of functional Sxl protein in female germ cells. sisA expression precedes Sxl induction in the germline and its ablation by RNAi results in impaired expression of Sxl, formation of ovarian tumors, and germline loss, phenocopying the loss of Sxl. Intriguingly, this phenotype can be rescued by the forced expression of Sxl, demonstrating that the primary function of sisA in the germline is the induction of Sxl expression. 

      Strengths: 

      The clever design of probes (for RNA FISH) and reporters allowed the authors to dissect Sxl expression from different promoters to get novel insight into sex-specific gene regulation in the germline. All experiments are carefully controlled. Since Sxl regulation differs between the soma and the germline, somatic tissues provide elegant internal controls in many experiments, ensuring e.g. functionality of the reporters. Similarly, animals carrying newly generated alleles (e.g. genomic tagging of the Sxl locus) are fertile and viable, demonstrating that the genetic manipulation does not interfere with protein function. The conclusions drawn from the experimental data are sound and advance our understanding of how Sxl expression is induced in the female germline. 

      Weaknesses: 

      The assays employed by the authors provide valuable information on when Sxl promoters become active. However, since no information on the stability of the gene products (i.e. RNA and protein) is available, it remains unclear when the SxlPE promoter is switched off in the germline (conceptually it only needs to be active for a short time period to initiate production of functional Sxl protein). As correctly stated by the authors, the persisting signals observed in the germline might therefore not reflect the continuous activity of the SxlPE promoter. 

      Mapping of regulatory elements and their function: SxlPE with 1.5 kb of flanking upstream sequence is sufficient to recapitulate early Sxl expression in the soma. The authors now provide evidence that beyond that, additional DNA sequences flanking the SxlPE promoter are required for germline expression. However, a more precise mapping was not performed. Also, due to technical limitations, the authors could not precisely map the sisA binding sites. Since this protein is also involved in the somatic induction of Sxl, its binding sites likely reside in the region 1.5kb upstream of the SxlPE promoter, which has been reported to be sufficient for somatic regulation. The regulatory role of the sequences beyond SxlPE-1.5kb therefore remains unaddressed and it remains to be investigated which trans-acting factor(s) exert(s) its/their function(s) via this region. 

      We agree that a more precise mapping of the essential elements within the 10.2 kb reporter is an important direction in which to proceed. Unfortunately, this is outside the scope of the current manuscript given current lab personnel. In regard to the 1.5 kb promoter that activates SxlPE in the soma, we do not feel that the SisA binding sites are necessarily in this region. It is important to note that, while the 1.5 kb promoter is sufficient for female-specific expression in the soma, it may not contain all of the regulatory elements that normally regulate PE from the endogenous locus. Activation of PE in the soma is thought to be regulated by a combination of positive-acting factors (SisA, SisB, etc.) and repressive factors (e.g. Dpn) that set a threshold for PE activation. Much more work would need to be done to determine whether all of these factors bind to the 1.5 kb promoter, or whether additional sequences are also involved in controlling the proper timing and robustness of normal Sxl PE activation in the soma.

      The central question of how Sxl expression is initiated and controlled in the germline remains unanswered. Since sisA is zygotically expressed in both the male and the female germline (Figure 4D), it is unlikely to be the factor that restricts Sxl expression to the female germline. 

      X chromosome “counting” elements like SisA are always expressed in both males and females, but it is thought that the 2X dose of them in females activates PE, while the 1X dose in males does not. Thus, we do expect SisA to be expressed in both males and females, as we observed.

      How does weak expression of Sxl in male tissues or expression above background after knockdown of sisA reconcile with the model that an autoregulatory feedback loop enforces constant and clonally inheritable Sxl expression once Sxl is induced? Is the current model for Sxl expression too simple or are we missing additional factors that modulate Sxl expression (such as e.g. Sister of Sex-lethal)? While I do not expect the authors to answer these questions, I would expect them to appropriately address these intriguing aspects in the discussion. 

      It is difficult to know what is “background” and what is actual weak Sxl expression in males. We agree that, if it is real, then why it doesn’t activate autoregulation of the Sxl PM transcript is mysterious. And yes, the current model for female-specific expression of Sxl in the soma may well be incomplete. Sxl PM transcript is present in the testis based on community RNA-seq data and our own analysis of male vs. female bam-mutant gonads (PMID 31329582), but it is at lower levels. Whether the lower level in the testis is due to tissue differences or sex-specific regulation of RNA levels is unknown. Our observations that the HA-tagged Sxl Early protein remains present in somatic cells in L1 larvae, and that GFP expression from the 10.2 kb Sxl PE-GFP can be detected in the soma until L2, could be due either to perdurance of the protein products or to continued sex-specific expression of PE long after the time that it was thought to shut off. This is also long after dosage compensation should have equalized X chromosome gene expression, meaning that X chromosomes can no longer be “counted” by factors like SisA and SisB. Thus, sex-specific expression of PE at this time would require another mechanism besides the current model (such as feedback regulation of Sxl PE transcription from downstream factors).

      Reviewer #2 (Public Review): 

      Summary: 

      The authors wanted to determine whether cis-acting factors of Sxl - two different Sxl promoters in somatic cells - regulate Sxl in a similar way in germ cells. They also wanted to determine whether trans-acting factors known to regulate Sxl in the soma also regulate Sxl in the germline. 

      Regarding the cis-acting factors, they examine the Sxl "establishment promoter" (SxlPE) that is activated in female somatic cells by the presence of two X chromosomes. Slightly later in development, dosage compensation equalizes X chromosome expression in males and females and so X chromosomes can no longer be counted. The second Sxl promoter is the "maintenance promoter," (SxlPM), which is activated in both sexes. The mRNA produced from the maintenance promoter has to be alternatively spliced by the early Sxl protein generated earlier in development from the PE. This leads to an autoregulatory loop that maintains Sxl expression in female somatic cells. The authors used fluorescent in situ hybridization (FISH) with oligopaints to determine the temporal activation of the PE or PM promoters. They find that - unlike the soma - the PE does not precede the PM and instead is activated contemporaneously or later than the PM - this is confusing with the later results (see below). Next, they generated transcriptional reporter constructs containing large segments of the Sxl locus, the 1.5 kb used in somatic studies, a 5.2 kb reporter, and a 10.2 kb reporter. Interestingly the 1.5 kb reporter that was reported to recapitulate Sxl expression in soma and germline was not observed by the authors. The 5.2 kb reporter was observed in female somatic cells but not in germ cells. Only when they included an additional 5 kb downstream of the 5.2 kb reporter (here the 10.2 kb reporter) did they see expression in germ cells, but this occurred at the L1 stages. Their data indicate that Sxl activity in the germ requires different cis-regulation than the soma and that the PE is activated later in germ cells than in somatic cells. The authors next use gene editing to insert epitope tags in two distinct strains in the hopes of creating an early Sxl and a later Sxl protein derived from the PE and PM, respectively. 
      The HA-tagged protein from the PE was seen in somatic cells but never in the germline, possibly due to very low expression. The FLAG-tagged late Sxl protein is observed in L2 germ cells. Because the early HA-Sxl protein is not perceptible in germ cells, it is not possible to conclude its role in the germline. However, because late FLAG-Sxl was only observed in L2 germ cells and the PE was detected in L1, this leaves open the possibility that PE produces early HA-Sxl (which currently cannot be detected), which then alternatively splices the transcript from the PM. In other words, the soma and germline could have a similar temporal relationship between the two Sxl promoters. While I agree with the authors about this conclusion, the earlier work with the oligopaints leads to the conclusion that PE is active after PM. This is confusing. 

      The temporal relationship between Sxl PE and Sxl PM in the germline is indeed confusing. One source of confusion comes from whether one is discussing Sxl protein production or promoter activity. As the reviewer nicely summarizes, our transcription analysis with oligopaints indicates that, unlike in the soma, Sxl PE is NOT on in the germline prior to PM. Our other data indicate that PE is instead likely only active well after transcription from PM has begun. However, this still means that the temporal order of the EARLY and LATE Sxl proteins can be the same as the soma. Even if PM is active well before PE in the germline, the PM transcript cannot produce any functional protein in the absence of being alternatively spliced by the Sxl protein (Sxl autoregulation). Thus, even if PM is active before PE in the germline, we would not expect to observe any LATE Sxl protein until the PE promoter comes on, and produces a pulse of EARLY Sxl protein. The fact that we observe LATE Sxl protein at L2 is consistent with our observation that the 10.2 kb Sxl PE reporter is active at L1. We will attempt to explain all of this better in a revised manuscript.

      Next, the authors wanted to turn their attention to the trans-acting factors that regulate Sxl in the soma, including Sisterless A (SisA), SisB, Runt, and the JAK/STAT ligand Unpaired. Using germline RNAi, the authors found that only knockdown of SisA causes ovarian tumors, similar to the loss of Sxl, suggesting that SisA regulates Sxl (ie the PE) in both the soma and the germline. They generated a SisA null allele using CRISPR/Cas9 and these animals had ovarian tumors and germ cell-less ovaries. FISH revealed that sisA is activated in primordial germ cells in stages 3-6 before the activation of Sxl. They used CRISPR-Cas9 to generate an endogenously-tagged SisA and found that tagged SisA was expressed in stage 3-6 PGCs, which is consistent with activating PE in the germline. They showed that sisA is upstream of Sxl as germline depletion of sisA led to a significant decrease in expression from the 10.2 kb PE reporter and in SXL protein. The authors could rescue the ovarian tumors and loss of Sxl protein upon germline depletion of sisA by supplying Sxl from another promoter (the otu promoter). These data indicate that sisA is necessary for Sxl activation in the germline. However, ectopic sisA in germ cells in the testis did not lead to ectopic Sxl, suggesting that sisA is not sufficient to activate Sxl in the germline. 

      Strengths: 

      (1) The genetic and genomic approaches in this study are top-notch and they have generated reagents that will be very useful for the field. 

      (2) Excellent use of powerful approaches (oligo paint, reporter constructs, CRISPR-Cas9 alleles). 

      (3) The combination of state-of-the-art approaches and quantification of phenotypes allows the authors to make important conclusions. 

      Weaknesses: 

      (1) Confusion in line 127 (this indicates that SxlPE is not activated before SxlPM in the germline) about PE not being activated before the PM in the germline when later figures show that PE is activated in L1 and late Sxl protein is seen in L2. It would be helpful to the readers if the authors edited the text to avoid this confusion. Perhaps more explanation of the results at specific points would be helpful. 

      We agree--see response above.

      Reviewer #3 (Public Review): 

      Summary: 

      The mechanisms governing the initial female-specific activation of Sex-lethal (Sxl) in the soma, the subsequent maintenance of female-specific expression and the various functions of Sxl in somatic sex determination and dosage compensation are well documented. While Sxl is also expressed in the female germline where it plays a critical role during oogenesis, the pathway that is responsible for turning Sxl on in germ cells has been a long-standing mystery. This manuscript from Goyal et al describes studies aimed at elucidating the mechanism(s) for the sex-specific activation of the Sex-lethal (Sxl) gene in the female germline of Drosophila. 

      In the soma, the Sxl establishment promoter, Sxl-Pe, is regulated in pre-cellular blastoderm embryos in somatic cells by several X-linked transcription factors (sis-a, sis-b, sis-c and runt). At this stage of development, the expression of these transcription factors is proportional to gene dose, 2x in females and 1x in males. The cumulative two-fold difference in the expression of these transcription factors is sufficient to turn Sxl-Pe on in female embryos. Transcripts from the Sxl-Pe promoter encode an "early" version of the female Sxl protein, and they function to activate a splicing positive autoregulatory loop by promoting the female-specific splicing of the initial pre-mRNAs derived from the Sxl maintenance promoter, Sxl-Pm (which is located upstream of Sxl-Pe). These female Sxl-Pm mRNAs encode a Sxl protein with a different N-terminus from the Sxl-Pe mRNAs, and they function to maintain female-specific splicing in the soma during the remainder of development. 

      In this manuscript, the authors are trying to understand how the Sxl-Pm positive autoregulatory loop is established in germ cells. If Sxl-Pe is used and its activation precedes Sxl-Pm as is true in the soma, they should be able to detect Sxl-Pe transcripts in germ cells before Sxl-Pm transcripts appear. To test this possibility, they generated RNA FISH probes complementary to the Sxl-Pe first exon (which is part of an intron sequence in the Sxl-Pm transcript) and to a "common sequence" that labels both Sxl-Pe and Sxl-Pm transcripts. Transcripts labeled by both probes were detected in germ cells beginning at stage 5 (and reaching a peak at stage 10), so either the Sxl-Pm and Sxl-Pe promoters turn on simultaneously, or Sxl-Pe is not active. 

      They next switched to Sxl-Pe reporters. The first Sxl-Pe:gfp reporter they used has a 1.5 kb upstream region which in other studies was found to be sufficient to drive sex-specific expression in the soma of blastoderm embryos. Also like the endogenous Sxl gene it is not expressed in germ cells at this early stage. In 2011, Hashiyama et al reported that this 1.5 kb promoter fragment was able to drive gfp expression in Vasa-positive germ cells later in development in stage 9/10 embryos. However, because of the high background of gfp in the nearby soma, their result wasn't especially convincing. Though they don't show the data, Goyal et al indicated that unlike Hashiyama et al they were unable to detect gfp expressed from this reporter in germ cells. Goyal et al extended the upstream sequences in the reporter to 5 kb, but they were still unable to detect germline expression of gfp. 

      Goyal et al then generated a more complicated reporter which extends 5 kb upstream of the Sxl-Pe start site and 5 kb downstream-ending at or near 4th exon of the Sxl-Pm transcript (the Sxl-Pe10 kb reporter). (The authors were not explicit as to whether the 5 kb downstream sequence extended beyond the 4th exon splice junction-in which case splicing could potentially occur with an upstream exon(s)-or terminated prior to the splice junction as seems to be indicated in their diagram.) With this reporter, they were able to detect sex-specific gfp expression in the germline beginning in L1 (first instar larva). With the caveat that gfp detection might be delayed compared to the onset of reporter activation, these findings indicated that the sequences in the reporter are able to drive sex-specific transcription in the germline at least as early as L1. 

      The authors next tagged the N-terminal end of the Sxl-Pe protein with HA (using Crispr/Cas9) and the N-terminal end of Sxl-Pm protein with Flag. They report that the HA-Sxl-Pe protein is first detected in the soma at stage 9 of embryogenesis. Somatic HA-Sxl-Pe protein persists into L1, but is no longer detected in L2. However, while somatic HA-Sxl-Pe protein is detected, they were unable to detect HA-Sxl-Pe protein in germ cells. In the case of FLAG-Sxl-Pm, it could first be detected in L2 germ cells indicating that at this juncture the Sxl-positive autoregulatory loop has been activated. This contrasts with Sxl-Pm transcripts which are observed in a few germ cells at stage 5 of embryogenesis, and in most germ cells by stage 10. The authors propose (based on the expression pattern of the Sxl-Pe10kb reporter and the appearance of Flag-Sxl-Pm protein) that Sxl-Pe comes on in germ cells in L1, and that the Sxl-Pe protein activates the female splicing of Sxl-Pm transcripts, giving detectable Flag-Sxl-Pm proteins beginning in L2. 

      To investigate the signals that activate Sxl-Pe in germ cells, the authors tested four of the X-linked genes (sis-a, sis-b, sis-c, and runt) that function to activate Sxl-Pe in the soma in early embryos. RNAi knockdown of sis-b, sis-c, and runt had no apparent effect on oogenesis. In contrast, knockdown of sis-a resulted in tumorous ovaries, a phenotype associated with Sxl mutations. (Three different RNAi transgenes were tested-two gave this phenotype, the third did not.) Sxl-Pe10kb reporter activity in L1 female germ cells is also dependent on sis-A. 

      Several approaches were used to confirm a role for sis-a in a) oogenesis and b) the activation of the Sxl-Pm autoregulatory loop. They showed that sis-a germline clones (using tissue-specific Crispr/Cas9 editing) resulted in the tumorous ovary phenotype and reduced the expression of Sxl protein in these ovaries. They found that sis-a transcripts and GFP-tagged Sis-A protein are present in germ cells. Finally, they showed that the tumorous ovary phenotype induced by germline RNAi knockdown of sis-a can be partially rescued by expressing Sxl in the germ cells. 

      Critique: 

      While this manuscript addresses a longstanding puzzle - the mechanism activating the Sxl autoregulatory loop in female germ cells-and likely identified an important germline transcriptional activator of Sxl, sis-a, the data that they've generated doesn't make a compelling story. At every step, there are puzzle pieces that don't fit the narrative. In addition, some of their findings are inconsistent with many previous studies. 

      We respect and appreciate this reviewer for the detailed comments. However, we feel that the claim that our work doesn’t “make a compelling story” and that many “pieces…don’t fit the narrative” is incorrect. The main issue that this reviewer raises is that we do not know if Sxl “early” transcription in the germline initiates from the Pe promoter. This is true, which we fully acknowledge, but the detail of whether “germline early” transcription of Sxl initiates from Pe or from another, as yet undefined, germline promoter does not affect the main conclusions of the paper. These conclusions are that 1) regulation of Sxl in the germline is fundamentally different from in the soma and 2) despite point (1), sisA acts as an activator of Sxl in both the soma and the germline. Neither of these main points is disputed by this reviewer.

      (1) The authors used RNA FISH to time the expression of Sxl-Pe and Sxl-Pm transcripts in germ cells. Transcripts complementary to Sxl-Pe and Sxl-Pm were detected at the same time in embryos beginning at stage 5. This is not a definitive experiment as it could mean a) that Sxl-Pe and Sxl-Pm turn on at the same time, b) that Sxl-Pe comes on after Sxl-Pm (as suggested by the Sxl-Pe10kb reporter) or c) Sxl-Pe never comes on. 

      When designing this experiment, we wanted to test whether the “soma model” of Pe activation before Pm was also true in the germ cells. Our data clearly demonstrate that transcripts beginning downstream of Pe are not expressed prior to transcripts beginning downstream of Pm. Thus, we can state that the “soma model” of Pe first and then Pm does not occur in the germline, which is very interesting. However, we cannot make any other conclusions about Pe in the germline from these data, as the reviewer indicates.

      (2) Hashiyama et al reported that they detected gfp expression in stage 9/10 germ cells from a 1.5 kb Sxl-Pe-gfp. As noted above, this result wasn't entirely convincing and thus it isn't surprising that Goyal et al were unable to reproduce it. Extending the upstream sequences to just before the 1st exon of Sxl-Pm transcripts also didn't give gfp expression in germ cells. Only when they added 5 kb downstream did they detect gfp expression. However, from this result, it isn't possible to conclude that the Sxl-Pe promoter is actually driving gfp expression in L1 germ cells. Instead, the Sxl promoter active in the germ line could be anywhere in their 10 kb reporter. 

      We agree that we have not determined the transcriptional start sites for Sxl in the germline and it is possible that the 10.2 kb reporter uses a different promoter than Pe, as long as that transcript can also be spliced into exon 4 where the GFP tag has been placed. The three types of experiments conducted—FISH to regions of the nascent transcripts, tagged versions of the different predicted ORFs, and promoter-GFP constructs—are extensive, but all have different limitations. Indeed, it would be challenging to determine the transcription start sites in the germline, as it would require obtaining enough L1 larvae to be able to dissociate the animals, or isolated gonads, into single cells in order to FACS purify the germ cells for RACE or long-read sequencing (I’m not sure that L1 larval single-nucleus seq would be enough for calling start sites). Otherwise, there would be no way to determine if expected or unexpected transcripts came from the soma or the germline. We can consider these experiments in the future.

      Fortunately, the main conclusions from this paper do not require knowing whether the germline uses Pe or some other “germline early” promoter that can produce Sxl protein in the absence of autoregulation by existing Sxl protein. The observations that a nascent transcript including the region downstream of Pm is observed in embryonic germ cells, but that the tagged LATE protein is not observed until L2, suggest that the transcript produced in early germ cells cannot produce a functional protein. This is consistent with the need for Sxl autoregulation of the Pm transcript in the germline as in the soma, as was previously thought. This is further supported by the observations that activity of the 10.2 kb reporter is only observed in L1 germ cells, and that the LATE Sxl protein is only observed in germ cells after this point. Thus, we can conclude that either Pe, or another “germline early” promoter, acts to produce female-specific Sxl protein to initiate autoregulation of Sxl splicing and protein production in the germline. We feel that this is a significant advance for the field, and we will make it more clear in the text that the initial expression of Sxl in the germline may not be from the Pe promoter.

      Other conclusions of the manuscript are unaffected by the start site for “germline early” Sxl transcription, including that the germline activates Sxl protein expression much later than the soma, which calls into question previous work indicating an early role for Sxl in the germline. Also unaffected is our conclusion that different enhancer sequences are required for activation of Sxl expression in the germline than in the soma, consistent with previous work demonstrating that the genetics of Sxl activation in the germline are different than in the soma. Lastly, our conclusions that sisA acts upstream of Sxl, and is required for Sxl germline expression, either directly or indirectly, are also unaffected by the nature of the Sxl “germline early” start site.

      (3) At least one experiment suggests that Sxl-Pe never comes on in germ cells. The authors tagged the N-terminus of the Sxl-Pe protein with HA and the N-terminus of the Sxl-Pm protein with Flag. Though they could detect HA-Sxl-Pe protein in the soma, they didn't detect it in germ cells. On the other hand, the Flag-Sxl-Pm protein was detected in L2 germ cells (but not earlier). These results would more or less fit with those obtained for the 10 kb reporter and would support the following model: Prior to L1, Sxl-Pm transcripts are expressed and spliced in the male pattern in both male and female germ cells. During L1, Sxl protein expressed via a mechanism that depends upon a 10 kb region spanning Sxl-Pe (but not on Sxl-Pe) is produced and by L2 there are sufficient amounts of this protein to switch the splicing of Sxl-Pm transcripts from a male to a female pattern-generating Flag-tagged Sxl-Pm protein. 

      As described above, it is indeed possible that another promoter besides Pe is active as the “germline early” promoter. We will make this more clear in a revised version, but the major conclusions of the manuscript are unaffected.

      (4) The 10kb reporter is sex-specific, but not germline-specific. The levels of gfp in female L1 somatic cells are equal to if not greater than those in L1 female germ cells. That the Sxl-Pe10kb reporter is active in the soma complicates the conclusion that it represents a germ line-specific promoter. Germline activity is, however, sensitive to sis-A knockdowns, which is a plus. Presumably, somatic expression of the reporter wouldn't be sensitive to a (late) sis-A knockdown, but this wasn't shown. 

      We are confused by this comment because we do not conclude that the Pe is a germline-specific promoter. Pe is known to be expressed in the soma, from considerable previous work cited by this reviewer, and the simplest model is that Pe is used in both the soma and the germline, as reflected by our 10.2 kb reporter. It is actually quite interesting how late this promoter seems active in the soma, contrary to current dogma, but we did not study somatic activation of Sxl in this work.

      (5) Their results with the HA-Sxl-Pe protein don't fit with many previous studies-assuming that the authors have explained their results properly. They report that HA-Sxl-Pe protein is first detected in the soma at stage 9 of embryogenesis and that it then persists till L2. However, previous studies have shown that Sxl-Pe transcripts and then Sxl-Pe proteins are first detected in ~NC11-NC12 embryos. In RNase protection experiments, the Sxl-Pe exon is observed in 2-4 hr embryos, but not detected in 5-8 hr, 14-12 hr, L1, L2, L3, or pupae. Northerns give pretty much the same picture. Western blots also show that Sxl-Pe proteins are first detectable around the blastoderm stage. So it is not at all clear why HA-Sxl-Pe proteins are first observed at stage 9 which, of course, is well after the time that the Sxl-Pm autoregulatory loop is established. 

      Given the obvious problems with the initial timing of somatic expression described here, it is hard to know what to make of the fact that HA-tagged Sxl-Pe proteins aren't observed in germ cells. 

      As for the presence of HA-Sxl-Pe proteins later than expected: While RNase protection/Northern experiments showed that Sxl-Pe mRNAs are expressed in 2-4 hr embryos and disappear thereafter, one could argue from the published Western experiments that the Sxl-PE proteins expressed at the blastoderm stage persist at least until the end of embryogenesis, though perhaps at somewhat lower levels than at earlier points in development. So the fact that Goyal et al were able to detect HA-Sxl-Pe proteins in stage 9 embryos and later on in L1 larvae probably isn't completely unexpected. What is unexpected is that the HA-Sxl-Pe proteins weren't present earlier. 

      We thank the reviewer for this detailed analysis. Since we were not focused on somatic expression of Sxl in this work, it is possible that stage 9 was the earliest stage we observed in our experiments, rather than the earliest stage in which it is ever observed. We will repeat these experiments to verify when the HA-tagged early Sxl protein is first observed. However, these comments have no bearing on our conclusions about Sxl expression in the germline, which is the focus of this manuscript.

      (6) The authors use RNAi and germline clones to demonstrate that sis-A is required for proper oogenesis: when sis-A activity is compromised in germ cells, i) tumorous ovary phenotypes are observed and ii) there is a reduction in the expression of Sxl-Pm protein. They are also able to rescue the phenotypic effects of sis-a knockdown by expressing a Sxl-Pm protein. While the experiments indicating sis-a is important for normal oogenesis and that at least one of its functions is to ensure that sufficient Sxl is present in the germline stem cells seem convincing, other findings would make the reader wonder whether Sis-A is actually functioning (directly) to activate Sxl transcription from promoter X. 

      It is true that we do not know the binding specificity for SisA, which is why we have made no claims about the directness of SisA regulation of Sxl. This does not change our conclusions that sisA is upstream of Sxl activation, since loss of sisA function has a similar phenotype to loss of Sxl, loss of sisA blocks Sxl protein expression, and expression of Sxl rescues the sisA mutant phenotype.

      The authors show that sis-a mRNAs and proteins are expressed in stage 3-5 germ cells (PGCs). This is not unexpected as the X-linked transcription factors that turn Sxl-Pe on are expressed prior to nuclear migration, so their protein products should be present in early PGCs. The available evidence suggests that their transcription is shut down in PGCs by the factors responsible for transcriptional quiescence (e.g., nos and pgc), in which case transcripts might be detected in only one or two PGCs, which fits with their images. However, it is hard to believe that expression of Sis-A protein in pre-blastoderm embryos is relevant to the observed activation of the Sxl-Pm autoregulatory loop hours later in L2 larvae. 

      It is also not clear how the very low level of gfp-Sis-A seen in only a small subset of migrating germ cells in stage 10 embryos (Figure S6) would be responsible for activating the Sxl-Pe10kb reporter in L1. It seems likely that the small amount of protein seen in stage 10 embryos is left over from the pre-cellular blastoderm stage. In this case, it would not be surprising to discover that the residual protein is present in both female and male stage 10 germ cells. This would raise further doubts about the relevance of the gfp-Sis-A at these early stages. 

      In fact, given the evidence presented implicating sis-a in activating Sxl, (the germline activation of the Sxl-Pe10kb reporter, the RNAi knockdowns, and the germ cell-specific sis-a clones) it is clear that the sis-A RNAs and proteins seen in pre-cellular blastoderm PGCs aren't relevant. The germline clone experiment (and also the RNAi knockdowns) indicates that sis-A must be transcribed in germ cells after Cas9 editing has taken place. Presumably, this would be after transcription is reactivated in the germline (~stage 10) and after the formation of the embryonic gonad (stage 14) so that the somatic gonadal cells can signal to the germ cells. With respect to the reporter, the relevant time frame for showing that sis-A is present in germ cells would be even later in L1. 

      The reviewer is correct in wondering how early sisA transcription can affect late Sxl activation, and we are clear about this conundrum in our manuscript. However, they are incorrect about the early sisA expression. Our experiments examining nascent sisA transcripts indicate that sisA is zygotically expressed in the formed germ cells rather than being leftover from expression in early nuclei. The fact that only a portion of germ cells express sisA at any time may well be due to a timing issue, where not all germ cells express sisA at the same time. They are also incorrect about the timing of Cas9 editing in the germline—the guide RNAs are expressed from a general promoter that is active both maternally and in the early embryo, and the Cas9 RNA from the nos promoter is deposited in the germ plasm where it is translated long before cellularization, meaning that sisA CRISPR knockout can begin at the earliest stages of germ cell formation or before.

      (7) As noted above, the data in this manuscript do not support the idea that Sxl-Pe proteins activate the Sxl-Pm female splicing in the germline. Flybase indicates that there is at least one other Sxl promoter that could potentially generate a transcript that includes the male exon but still could encode a Sxl protein. This promoter "Sxl-Px" is located downstream of Sxl-Pm and from its position it could have been included in the authors' 10 kb reporter. The reported splicing pattern of the endogenous transcript skips exon2, and instead links an exon just downstream of Sxl-Px to the male exon. The male exon is then spliced to exon4. If the translation doesn't start and end at one of the small upstream orfs in the exons close to Sxl-Px and the male exon, a translation could begin with an AUG codon in exon4 that is in frame with the Sxl protein coding sequence. This would produce a Sxl protein that lacks aa sequences from N-terminus, but still retains some function. 

      Another possible explanation for how gfp is expressed from the 10 kb reporter is that the transcript includes the "z" exon described by Cline et al., 2010.

      As discussed above, the exact location of the start site for the Sxl transcript in the germline remains to be determined, but does not affect the main conclusions of the paper.

    1. Reviewer #1 (Public Review):

      Review after revision

      Of note, the main results of this article are very similar to the results presented in the previous manuscript (same Figures 1 to 9, addition of Figure 10 with no quantification). Unfortunately, the main weaknesses of the article have not been addressed:

      (1) The main findings have been obtained in clones of Jurkat cells. They have not been confirmed in primary T cells. The only experiment performed in primary cells is shown in Figure S7 (primary human T lymphoblasts) for which only the distribution of FMNL1 is shown without quantification. No results presenting the effect of FMNL1 KO and expression of mutants in primary T cells are shown.

      (2) An in-depth analysis of the defect in actin remodeling (quantification of the images, analysis of some key actors of actin remodeling) is still lacking. Only F-actin is shown; no attempt has been made to look more precisely at the actors of actin remodeling.

      (3) The defect in the secretion of extracellular vesicles is still very preliminary. Examples of STED images given by the authors are nice, yet no quantification is performed.

      (4) Results shown in Figure S12 on the colocalization of proteins phosphorylated on Ser/Thr are still not convincing. It seems indeed that "phospho-PKC" is labeling more preferentially the CMAC positive cells (Raji) than the Jurkat T cells. It is thus particularly difficult to conclude on the co-localization and even more on the recruitment of phosphorylated-FMNL1 at the IS. Thus, these experiments are not conclusive and cannot be the basis even for their cautious conclusion: "Although all these data did not allow us to infer that FMNL1b is phosphorylated at the IS due to the resolution limit of confocal and STED microscopes, the results are compatible with the idea that both endogenous FMNL1 and YFP-FMNL1bWT are specifically phosphorylated at the cIS".

      The study would benefit from a more careful statistical analysis. The dot plots showing polarity are presented for one experiment. Yet, the distribution of the polarity is broad. Results of the 3 independent experiments should be shown and a statistical analysis performed on the independent experiments.

    2. Reviewer #2 (Public Review):

      Summary

      Based on i) the documented role of FMNL1 proteins in IS formation; ii) their ability to regulate F-actin dynamics; iii) the implication of PKCdelta in MVB polarization to the IS and FMNL1beta phosphorylation; and iv) the homology of the C-terminal DAD domain of FMNL1beta with FMNL2, where a phosphorylatable serine residue regulating its auto-inhibitory function had been previously identified, the authors have addressed the role of S1086 in the FMNL1beta DAD domain in F-actin dynamics, MVB polarization and exosome secretion, and investigated the potential implication of PKCdelta, which they had previously shown to regulate these processes, in FMNL1beta S1086 phosphorylation. They demonstrate that FMNL1beta is indeed phosphorylated on S1086 in a PKCdelta-dependent manner and that S1086-phosphorylated FMNL1beta acts downstream of PKCdelta to regulate centrosome and MVB polarization to the IS and exosome release. They provide evidence that FMNL1beta accumulates at the IS where it promotes F-actin clearance from the IS center, thus allowing for MVB secretion.

      Strengths

      The work is based on a solid rationale, which includes previous findings by the authors establishing a link between PKCdelta, FMNL1beta phosphorylation, synaptic F-actin clearance and MVB polarization to the IS. The authors have thoroughly addressed the working hypotheses using robust tools. Among these, of particular value is an expression vector that allows for simultaneous RNAi-based knockdown of the endogenous protein of interest (here all FMNL1 isoforms) and expression of wild-type or mutated versions of the protein as YFP-tagged proteins to facilitate imaging studies. The imaging analyses, which are the core of the manuscript, have been complemented by immunoblot and immunoprecipitation studies, as well as by the measurement of exosome release (using a transfected MVB/exosome reporter to discriminate exosomes secreted by T cells).

      Weaknesses

      The authors have satisfactorily addressed the weaknesses pointed out in my previous review.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      First, all the experiments are performed in Jurkat T cells that may not recapitulate the regulation of polarization in primary T cells.

      To extend our results in Jurkat cells forming IS to primary cells, we have now performed experiments using synapses established by Raji cells and either primary T cells (TCR-mediated) or primary CAR T cells (CAR-mediated) (new Suppl. Fig. S7). These experiments clearly show the presence of FMNL1 at these two different IS classes (new Suppl. Fig. S7), similar to what was found in Jurkat-Raji synapses. In addition, since most of the experiments were performed in Jurkat cells, we have changed the title of our manuscript, to be faithful to the main body of our results. New sentences dealing with this important issue have been included in the Results and Discussion sections.

      Moreover, all the experiments analyzing the role of PKCdelta are performed in one clone of wt or PKCdelta KO Jurkat cells. This is problematic since clonal variation has been reported in Jurkat T cells.

      Referee is right, this is the reason why we have studied three different control clones (C3, C9, C7) and three PKCdelta-interfered clones (P5, P6 and S4), all derived from the JE6.1 clone, and the results have been previously published (Herranz et al 2019)(Bello-Gamboa et al 2020). All these clones expressed similar levels of the relevant cell surface molecules and formed synaptic conjugates with similar efficiency (Herranz et al 2019). The P5, P6 and S4 clones exhibited a similar defect in MVB/MTOC polarization when compared with the control clones (Herranz et al 2019)(Bello-Gamboa et al 2020). Experiments developed by other researchers using a different clone of Jurkat (JE6.1) and primary CD4+ and CD8+ lymphocytes interfered in FMNL1 (Gomez et al. 2007) showed a comparable defect in MTOC polarization to that found in our control clones when they were transiently interfered in FMNL1 (Bello-Gamboa et al 2020, this manuscript). In this manuscript we have studied, instead of the canonical JE6.1 clone, the C3 and C9 control clones derived from JE6.1, since the puromycin-resistant control clones (containing a scramble shRNA) were isolated by limiting dilution together with the PKCdelta-interfered clones (Herranz et al. 2019); thus C3 and C9 clones are the best possible controls to compare with P5 and P6 clones. Please realize that microsatellite analyses, available upon request, support the identity of our C3 clone with JE6.1. Moreover, when GFP-PKCdelta was transiently expressed in the three PKCdelta-interfered clones, MTOC/MVB polarization was recovered to control levels (Herranz et al. 2019). Therefore, the deficient MTOC/MVB polarization in all these clones is exclusively due to the reduction in PKCdelta expression (Herranz et al 2019), and thus clonal variation cannot underlie our results in stable clones.
      We have now included new sentences to address this important point and to mention the inability of FMNL1betaS1086D to revert the deficient MTOC polarization occurring in the P6 PKCdelta-interfered clone, as occurred in the P5 clone. Because we have now included more figures and panels to satisfy the editor's and referees' comments, we have not included the dot plot data corresponding to C9 and P6 clones, to avoid a too long and repetitive manuscript. Since all the FMNL1 interference and FMNL1 variants reexpression experiments were performed in transient assays (2-4 days after transfection), there was no chance for any clonal variation in these short-time experiments. Moreover, internal controls using untransfected cells or Raji cells unpulsed with SEE were carried out in all these transient experiments.

      Finally, although convincing, the defect in the secretion of vesicles by T cells lacking phosphorylation of FMNL1beta on S1086 is preliminary. It would be interesting to analyze more precisely this defect. The expression of the CD63‑GFP in mutants by WB is not completely convincing. Are other markers of extracellular vesicles affected, e.g. CD3 positive?

      We acknowledge this comment. It is true that the mentioned results do not directly demonstrate the presence of exosomes at the synaptic cleft of the synapses, since the nanovesicles were harvested from the cell culture supernatants from synaptic conjugates and these nanovesicles could be produced by multi‑directional degranulation of MVBs. To address this important issue, we have performed STED super‑resolution imaging of the immune synapses made by control and FMNL1-interfered cells. Nanosized (100-150 nm) CD63+ vesicles can be found in the synaptic cleft between APC and control cells with polarized MVBs, whereas we could not detect these vesicles in the synaptic cleft from FMNL1-interfered cells that maintain unpolarized MVBs (New Fig. 10). New sentences have been included in the Results and Discussion dealing with this important point. Regarding the use of CD3 as a marker of extracellular vesicles, please realize that CD3 is neither an enriched nor a specific marker of exosomes, since it is also present in plasma membrane shedding vesicles, molting vesicles from microvilli, apoptotic bodies and small cell fragments, apart from exosomes, thus we have preferred to use the canonic exosome marker CD63 as a general exosome reporter readout, for WB and immunofluorescence (MVBs, exosomes), time-lapse of MVBs (suppl. Video 8) and super resolution experiments (Fig. 10).   

      Reviewer #2 (Public Review):

      Summary:

      The authors have addressed the role of S1086 in the FMNL1beta DAD domain in F-actin dynamics, MVB polarization, and exosome secretion, and investigated the potential implication of PKCdelta, which they had previously shown to regulate these processes, in FMNL1beta S1086 phosphorylation. This is based on:

      (1) the documented role of FMNL1 proteins in IS formation

      (2) their ability to regulate F-actin dynamics

      (3) the implication of PKCdelta in MVB polarization to the IS and FMNL1beta phosphorylation

      (4) the homology of the C-terminal DAD domain of FMNL1beta with FMNL2, where a phosphorylatable serine residue regulating its auto-inhibitory function had been previously identified. They demonstrate that FMNL1beta is indeed phosphorylated on S1086 in a PKCdelta-dependent manner and that S1086-phosphorylated FMNL1beta acts downstream of PKCdelta to regulate centrosome and MVB polarization to the IS and exosome release. They provide evidence that FMNL1beta accumulates at the IS where it promotes F-actin clearance from the IS center, thus allowing for MVB secretion.  

      Strengths

      The work is based on a solid rationale, which includes previous findings by the authors establishing a link between PKCdelta, FMNL1beta phosphorylation, synaptic F-actin clearance, and MVB polarization to the IS. The authors have thoroughly addressed the working hypotheses using robust tools. Among these, of particular value is an expression vector that allows for simultaneous RNAi-based knockdown of the endogenous protein of interest (here all FMNL1 isoforms) and expression of wild-type or mutated versions of the protein as YFP-tagged proteins to facilitate imaging studies. The imaging analyses, which are the core of the manuscript, have been complemented by immunoblot and immunoprecipitation studies, as well as by the measurement of exosome release (using a transfected MVB/exosome reporter to discriminate exosomes secreted by T cells).

      Weaknesses

      The data on F-actin clearance in Jurkat T cells knocked down for FMNL1 and expressing wild-type FMNL1 or the non-phosphorylatable or phosphomimetic mutants thereof would need to be further strengthened, as this is a key message of the manuscript. Also, the entire work has been carried out on Jurkat cells. Although this is an excellent model easily amenable to genetic manipulation and biochemical studies, the key finding should be validated on primary T cells.

      Referee’s global assessment is right. To extend our results in Jurkat cells forming IS, we have now performed experiments using synapses established by Raji cells and either primary T cells (TCR-mediated) or primary CAR T cells (CAR-mediated) (new Suppl. Fig. S7). These experiments clearly show the presence of FMNL1 at these two different IS classes (new Suppl. Fig. S7), similar to what was found in Jurkat-Raji synapses. In addition, since most of the experiments were performed in Jurkat cells, we have changed the title of our manuscript, to be faithful to the main body of our results. New sentences have been included in Results and Discussion to address these important points.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      This study shows the role of the phosphorylation of FMNL1b on S1086 in the polarity of T lymphocytes, which is a new and interesting finding. It would be important to confirm some of the key results in primary T cells and to analyze in depth the defect in actin remodeling (quantification of the images, analysis of some key actors of actin remodeling). The description of the defect in the secretion of extracellular vesicles would also benefit from a more accurate analysis of the content of vesicles.

      Referee is right. We have now performed experiments using synapses containing Raji cells and either primary T cells (TCR-mediated) or primary CAR T cells (CAR-mediated) (new Suppl. Fig. S7). These experiments clearly show the presence of FMNL1 at these two different IS classes, similar to what was found in Jurkat-Raji synapses. Moreover, since most of the experiments were performed in Jurkat cells, we have changed the title of our manuscript, to be faithful to the main body of our results. Regarding the use of CD63 instead of other markers such as, for instance, CD3 (as stated by the other referee), please realize that CD3 is neither an enriched nor a specific marker of exosomes, since it is also present in plasma membrane shedding vesicles, molting vesicles from microvilli, apoptotic bodies and small cell fragments, apart from exosomes. Thus we have preferred to use the accepted consensus, canonical extracellular vesicle marker CD63 (International Society of Extracellular Vesicles positioning, Thery et al 2018, doi: 10.1080/20013078.2018.1535750; Alonso et al. 2011) as a general exosome reporter readout for WB, immunofluorescence (MVBs, exosomes) and super-resolution experiments. Accordingly, the GFP-CD63 reporter plasmid was used for exosome secretion in transient expression studies and living cell time-lapse experiments (Suppl. Video 8). Any other exosome marker will also be present in Raji cells and will not allow us to analyse exclusively the secretion of exosomes by the effector Jurkat cells, since B lymphocytes produce a large quantity of exosomes upon MHC-II stimulation by Th lymphocytes (Calvo et al, 2020, doi:10.3390/ijms21072631). To reinforce the exosome data in the context of the immune synapse, STED super-resolution imaging of the immune synapses made by control and FMNL1-interfered cells was performed. Nanosized (100-150 nm) CD63+ vesicles can be found in the synaptic cleft of control cells with polarized MVBs, whereas we could not detect these vesicles in the synaptic cleft from FMNL1-interfered cells that maintain unpolarized MVBs (new Fig. 10).

      Moreover, all the videos are not completely illustrative. For example, in video 2 it would be more appropriate show only the z plane corresponding to the IS to see more precisely the F-actin remodeling relative to CD63 labeling.

      Referee is right. It is true that the upper rows in some videos may distract the reader from the main message contained in the lower row, which includes the 90º turn-generated zx plane corresponding to the IS interface. Accordingly, we have maintained the still images of the whole synaptic conjugates in the first row from video 2; this will allow the reader to perceive a general view of the fluorochromes on the whole cell conjugates, as a reference, and to compare precisely the F-actin remodeling relative to CD63 labeling only at the zx interface (lower row). We have now processed videos 1 and 5 following similar criteria.

      The quality of videos 3 and 4 is not good enough. For video 7, it seems that the labeling of phospho-Ser is very broad at the IS, which is expected since it should label all the proteins that are phosphorylated by PKCs. The resolution of microscopy (at the best 200 to 300 nm) does not allow us to conclude on the co-localization of FMNL1b with phospho-Ser and is thus not conclusive. Finally, the study would benefit from a more careful statistical analysis. The dot plots showing polarity are presented for one experiment. Yet, the distribution of the polarity is broad. Results of the 3 independent experiments should be shown and a statistical analysis performed on the independent experiments.

      Referee is right, we have amended video settings (brightness/contrast) in videos 3 and 4 to improve this issue. In addition, we would like to remark that the translocation of proteins to cellular substructures in living cells is not a trivial issue, since certain protein localizations are too dynamic to be properly imaged with enough spatial resolution. The equilibrium resulting from the association/dissociation of a certain protein to the membrane, in addition to the protein diffusion naturally occurring in living cells, as well as signal intensity fluctuations inherent to the stochastic nature of fluorescence emission often provide barriers for image quality (Shroff et al, 2024). Thus, additional image blurring is expected when compared with that observed in fixed samples. However, we think it is important to provide the potential readers with a dynamic view of FMNL1 localization, which can only be achieved through real-time videos, in addition to the still frames from the same videos provided in Fig. 6A (the referee did not argue against the inclusion of these frames), together with images from fixed cells in Fig 6B, for comparison. This is the reason why we have preferred to maintain the improved videos to complement the results of some spare frames from the videos, together with images from fixed cells in the same figure (Fig. 6).

      Regarding video 7, we agree that colocalization is limited by the spatial resolution of confocal microscopy, and this fact does not allow us to infer that FMNL1beta is phosphorylated at the IS. However, please realize we have never concluded this in our manuscript. Instead, we claimed that “colocalization of endogenous FMNL1 and YFP-FMNL1βWT with anti-phospho-Ser … is compatible with the idea that both endogenous FMNL1 and YFP-FMNL1βWT are specifically phosphorylated at the cIS”. Moreover, we have now performed colocalization in super-resolved STED microscopy images, which reduces the XY resolution down to 30-40 nm (Suppl. Fig. S12), and the results also support colocalization of endogenous FMNL1 with anti-phospho-Ser PKC at the IS within a 30 nm resolution limit. We have now somewhat softened our conclusion: “Although all these data did not allow us to infer that FMNL1β is phosphorylated at the IS due to the resolution limit of confocal and STED microscopes, the results are compatible with the idea that both endogenous FMNL1 and YFP-FMNL1βWT are specifically phosphorylated at the cIS”.

      Regarding statistical analyses, we agree the dot distribution in the polarity experiments is quite broad, but this is consistent with the end point strategy used by a myriad of research groups (including ourselves) to image an intrinsically stochastic, rapid and asynchronous process such as immune synapse formation and to score MTOC/MVB polarization (Calvo et al 2018, https://doi.org/10.3389/fimmu.2018.00684). Despite this fact, ANOVA analyses have underscored the statistical significance of all the experiments represented by dot plots. We cannot average or perform meta statistical analyses by combining the equivalent cohort results from independent experiments, since we have observed that small variations of certain variables (SEE concentration, cell recovery, time after transfection, etc.) affect synapse formation and PI values among experiments without altering the final outcome in each case. Please note that our manuscript now includes 10 multi-panel figures, 12 multi-panel supplementary figures and 8 videos, and it is already quite large. Thus, we feel the inclusion of redundant, triplicate dot plot figures will dilute the main message of our already comprehensive contribution and distract any potential reader from it. We have now included new sentences in the figure legends to remark that ANOVA analyses were executed separately in all 3 independent experiments.

      Reviewer #2 (Recommendations For The Authors):

      (1) The key findings should be validated on primary CD4+ T cells (of which Jurkat is a transformed model).

      Referee is right. However, as commented by the other referee, the data from activating surfaces clearly shows that the synaptic actin architecture of the immune synapse from primary CD8+ T cells is essentially indistinguishable and thus unbiased from that of Jurkat T cells, but different to that of primary CD4+ cells (Murugesan, 2016). Thus, our data in Jurkat T cells are directly applicable to the synaptic architecture of primary CD8+ cells. In addition, to definitely extend our results in Jurkat cells forming IS, we have performed experiments using synapses established by Raji cells and either primary T cells (TCR-mediated) or primary CAR T cells (CAR-mediated) (new Suppl. Fig. S7). We have preferred to work with mixed CD4+ and CD8+ cells in order to maintain potential interactions in trans between these subpopulations that may affect or influence IS formation. These experiments clearly show the presence of FMNL1 at these two different IS classes (new Suppl. Fig. S7), similar to what was found in Jurkat-Raji synapses. Moreover, since most of the experiments were performed in Jurkat cells as stated by the referee, we have changed the title of our manuscript, to circumscribe our results to the model we have used and to be faithful to the main body of our results.

      (2) The image of wt YFP-FMNL1beta in Figure 4A displays a weak CD63 signal and shows an asymmetric polarization of both the centrosome and MVBs. It should be replaced with a more representative one.

      Referee is right. Accordingly, we have modified the CD63 channel settings (brightness/contrast) in this panel to make it comparable to the other panels in the same figure. In addition, thanks to this referee's comment, we have realized that the position of the MTOC (yellow dot) in the diagram on the right side of the YFP-FMNL1betaWT panels row appeared mislocated, producing the mentioned apparent asymmetry with respect to the position of the MVBs' center of mass (green dot). This mistake leads to an apparent segregation between the positions of the centers of mass of these organelles, which certainly does not correspond with the real image. We have now amended the scheme and we apologize for this mistake.

      (3) The images showing F-actin clearance at the IS (Figure 8, S4, S5) are not very convincing, also when looking at the MFI along the T cell-APC interface in the en-face views. Since the F-actin signal also includes some signal from the APC, transfecting T cells with an actin reporter to selectively image T cell actin could better clarify this key point.

      Referee's point is correct. However, we (83), and other researchers using the proposed actin reporter approach in the same Raji/Jurkat IS model (Fig. 4 in ref 84) have already excluded the possibility that actin cytoskeleton of Raji cells can also contribute to the measurements of synaptic F-actin. In Materials and Methods, page 37, lines 1048-1055 we included this related sentence: “It is important to remark that MHC-II-antigen triggering on the B cell side of the Th synapse does not induce noticeable F-actin changes along the synapse (i.e. F-actin clearing at the central IS), in contrast to TCR stimulation on T cell side (84) (85) (3). In addition, we have observed that majority of F-actin changes along the IS belongs to the Jurkat cell (83). Thus, the contribution to the analyses of the residual, invariant F-actin from the B cell is negligible using our protocol (83).”

      Thus, we can exclude that this caveat affects our results.

      (4) A similar consideration applies to the MVB distribution in the en‑face images. For example, in Figure S5 the MVB profile, with some peripheral distribution, does not appear very different in cells expressing wt YFP‑tagged FMNL1beta versus the S1086A‑expressing cells.

      The referee's assessment regarding Supp. Figure S5 is valid. Using only the plot profile, the outcomes obtained with YFP-FMNL1βWT may appear comparable to those derived from YFP-FMNL1βS1086A. Nonetheless, this resemblance is attributed to the plot profile's exclusive consideration of the MVBs signal in the interface from the immune synapse region (white rectangle). The upper images (second row), where the whole cell is displayed, illustrate that in YFP-FMNL1βWT, MVB are specifically accumulated within this specific region, in contrast to the scattered distribution observed in YFP-FMNL1βS1086A, where MVB are dispersed throughout the cell without distinction. While MVBs are evident in both instances within the synapse region, the reason behind this observation is different. The YFP-FMNL1βWT transfected cell (third column) shows a pronounced MVB concentration within the synaptic area (white rectangle), which leads to MVB PI=0.52, whereas the YFP-FMNL1βS1086A transfected cell (fourth column), as it presents a scattered distribution of MVB throughout the cell, also exhibits some MVB (but only a small proportion of the total cellular MVB) in the synaptic area, which yields MVB PI=-0.09. Please realise that the position of the center of mass of the distribution of MVB (MVBC) labelled in this figure (white squares) is an unbiased parameter that mirrors MVB center of mass polarization. A new sentence has been included in the figure legend to clarify this important point.

      (5) The image in the first row in Figure 6B does not show a clear accumulation of FMNL1beta at the IS, possibly because the T cell is in contact with two APCs. This image should be replaced.

      Referee is right. Therefore, we have replaced the quoted example with a single cell:cell synapse that shows a clearer and more localized accumulation in the cIS, thereby avoiding the mentioned caveat.

      (6) In Figure 2A the last row shows what appears to be a T:T cell conjugate (with one cell expressing the YFP-tagged protein). The image should be replaced with another showing a T cell-APC (blue) conjugate.

      Referee is right, we have accordingly replaced the mentioned image with a T cell:APC conjugate.

      (7) The Discussion is very long and dispersive. It would benefit from shortening it and making it more focused.

      Referee is right, we have shortened and focused it, by eliminating the whole second and third paragraphs of the discussion. Moreover, a whole paragraph in page 24 has been also deleted.

      We have also focussed the discussion towards the new data in primary T lymphocytes.

    1. eLife assessment

      This valuable manuscript describes evidence of sex differences in specific corticostriatal projections during alcohol consumption, and this is noteworthy given the increasing rates/levels of drinking in females and their liability for Alcohol Use disorder. The authors provide solid evidence of the lateralisation of the activity of the circuit, but other evidence is incomplete, particularly with regard to how the drinking measure relates to intoxication. There are some inconsistencies that make it difficult to reconcile the photometry and behavioral data. The findings would benefit from causal assessment in the future. The findings will be of interest to researchers investigating functional circuitry underlying alcohol-driven behaviors.

    2. Reviewer #1 (Public Review):

      Summary:

      This paper uses a model of binge alcohol consumption in mice to examine how the behaviour and its control by a pathway between the anterior insular cortex (AIC) to the dorsolateral striatum (DLS) may differ between males and females. Photometry is used to measure the activity of AIC terminals in the DLS when animals are drinking and this activity seems to correspond to drink bouts in males but not females. The effects appear to be lateralized with inputs to the left DLS being of particular interest.

      Strengths:

      Increasing alcohol intake in females is of concern and the consequences for substance use disorder and brain health are not fully understood, so this is an area that needs further study. The attempt to link fine-grained drinking behaviour with neural activity has the potential to enrich our understanding of the neural basis of behaviour, beyond what can be gleaned from coarser measures of volumes consumed etc.

      Weaknesses:

      The introduction to the drinking in the dark (DID) paradigm is rather narrow in scope (starting line 47). This would be improved if the authors framed this in the context of other common intermittent access paradigms and gave due credit to important studies and authors that were responsible for the innovation in this area (particularly studies by Wise, 1973 and returned to popular use by Simms et al 2010 and related papers; e.g., Wise RA (1973). Voluntary ethanol intake in rats following exposure to ethanol on various schedules. Psychopharmacologia 29: 203-210; Simms, J., Bito-Onon, J., Chatterjee, S. et al. Long-Evans Rats Acquire Operant Self-Administration of 20% Ethanol Without Sucrose Fading. Neuropsychopharmacol 35, 1453-1463 (2010).) The original drinking in the dark demonstrations should also be referenced (Rhodes et al., 2005). Line 154 Theile & Navarro 2014 is a review and not the original demonstration.

      When sex differences in alcohol intake are described, more care should be taken to be clear about whether this is in terms of volume (e.g. ml) or blood alcohol levels (BAC, or at least g/kg as a proxy measure). This distinction was often lost when lick responses were being considered. If licking is similar (assuming a single lick from a male and female brings in a similar volume?), this might mean males and females consume similar volumes, but females due to their smaller size would become more intoxicated so the implications of these details need far closer consideration. What is described as identical in one measure, is not in another.
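      The volume versus dose distinction raised above can be made concrete with a minimal sketch. All numbers here are illustrative assumptions and are not taken from the manuscript: a 20% (v/v) ethanol solution, a pure-ethanol density of 0.789 g/ml, and hypothetical body weights of 25 g (male) and 20 g (female). The function name `ethanol_dose_g_per_kg` is likewise hypothetical.

```python
# Illustrative conversion from volume consumed (ml) to ethanol dose (g/kg).
# Assumptions (not from the manuscript): 20% v/v ethanol, hypothetical body weights.

ETHANOL_DENSITY_G_PER_ML = 0.789  # approximate density of pure ethanol


def ethanol_dose_g_per_kg(volume_ml, body_weight_g, ethanol_fraction_vv=0.20):
    """Return the ethanol dose in g/kg for a given volume of solution consumed."""
    ethanol_grams = volume_ml * ethanol_fraction_vv * ETHANOL_DENSITY_G_PER_ML
    body_weight_kg = body_weight_g / 1000.0
    return ethanol_grams / body_weight_kg


# Same volume consumed by two animals of different (hypothetical) body weight.
male_dose = ethanol_dose_g_per_kg(volume_ml=1.0, body_weight_g=25.0)
female_dose = ethanol_dose_g_per_kg(volume_ml=1.0, body_weight_g=20.0)

print(f"male:   {male_dose:.2f} g/kg")
print(f"female: {female_dose:.2f} g/kg")
```

      Under these assumed values, identical volumes give a higher g/kg dose for the lighter animal, which is why a measure described as identical in ml or licks need not be identical in terms of intoxication.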

      While the authors have some previous data on the AIC to DLS pathway, there are many brain regions and pathways impacted by alcohol and so the focus on this one in particular was not strongly justified. Since photometry is really an observational method, it's important to note that no causal link between activity in the pathway and drinking has been established here.

      It would be helpful if the authors could further explain whether their modified lickometers actually measure individual licks. While in some systems contact with the tongue closes a circuit which is recorded, the interruption of a photobeam was used here. It's not clear to me whether the nose close to the spout would be sufficient to interrupt that beam, or whether a tongue protrusion is required. This detail is important for understanding how the photometry data is linked to behaviour. The temporal resolution of the GCaMP signal is likely not good enough to capture individual licks but I think more caution or detail in the discussion of the correspondence of these events is required.

      Even if the pattern of drinking differs between males and females, the use of the word "strategy" implies a cognitive process that was never described or measured.

    3. Reviewer #2 (Public Review):

      Summary:

      This study looks at sex differences in alcohol drinking behaviour in a well-validated model of binge drinking. They provide a comprehensive analysis of drinking behaviour within and between sessions for males and females, as well as looking at the calcium dynamics in neurons projecting from the anterior insula cortex to the dorsolateral striatum.

      Strengths:

      Examining specific sex differences in drinking behaviour is important. This research question is currently a major focus for preclinical researchers looking at substance use. Although we have made a lot of progress over the last few years, there is still a lot that is not understood about sex-differences in alcohol consumption and the clinical implications of this.

      Identifying the lateralisation of activity is novel, and has fundamental importance for researchers investigating functional anatomy underlying alcohol-driven behaviour (and other reward-driven behaviours).

      Weaknesses:

      Very small and unequal sample sizes, especially females (9 males, 5 females). This is probably ok for the calcium imaging, especially with the G-power figures provided, however, I would be cautious with the outcomes of the drinking behaviour, which can be quite variable.

      For female drinking behaviour, rather than this being labelled "more efficient", could this just be that female mice (being substantially smaller than male mice) just don't need to consume as much liquid to reach the same g/kg. In which case, the interpretation might not be so much that females are more efficient, as that mice are very good at titrating their intake to achieve the desired dose of alcohol.

    4. Reviewer #3 (Public Review):

      Summary:

      In this manuscript by Haggerty and Atwood, the authors use a repeated binge drinking paradigm to assess how water and ethanol intake changes in male and female mice as well as measure changes in anterior insular cortex to dorsolateral striatum terminal activity using fiber photometry. They find that, overall, males and females have similar water and ethanol intake, but females appear to be more efficient alcohol drinkers. Using fiber photometry, they show that the anterior insular cortex (AIC) to dorsolateral striatum (DLS) projections have sex, fluid, and lateralization differences. The male left circuit was most robust when aligned to ethanol drinking, and water was somewhat less robust. Male right, and female left and right, had essentially no change in photometry activity. To some degree, the changes in terminal activity appear to be related to fluid exposure over time, as well as within-session differences in trial-by-trial intake. Overall, the authors provide an exhaustive analysis of the behavioral and photometric data, thus providing the scientific community with a rich information set to continue to study this interesting circuit. However, although the analysis is impressive, there are a few inconsistencies regarding specific measures (e.g., AUC, duration of licking) that do not quite fit together across analytic domains. This does not reduce the rigor of the work, but it does somewhat limit the interpretability of the data, at least within the scope of this single manuscript.

      Strengths:

      - The authors use high-resolution licking data to characterize ingestive behaviors.

      - The authors account for a variety of important variables, such as fluid type, brain lateralization, and sex.

      - The authors provide a nice discussion on how this data fits with other data, both from their laboratory and others'.

      - The lateralization discovery is particularly novel.

      Weaknesses:

      - The volume of data and number of variables provided makes it difficult to find a cohesive link between data sets. This limits interpretability.

      - The authors describe a clear sex difference in the photometry circuit activity. However, I am curious about whether female mice that drink more similarly to males (e.g., less efficiently?) also show increased activity in the left circuit, similar to males. Oppositely, do very efficient males show weaker calcium activity in the circuit? Ultimately, I am curious about how the circuit activity maps to the behaviors described in Figures 1 and 2.

      - What does the change in water-drinking calcium imaging across time in males mean? Especially considering that alcohol-related signals do not seem to change much over time, I am not sure what it means to have water drinking change.

    5. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      This paper uses a model of binge alcohol consumption in mice to examine how the behaviour and its control by a pathway between the anterior insular cortex (AIC) to the dorsolateral striatum (DLS) may differ between males and females. Photometry is used to measure the activity of AIC terminals in the DLS when animals are drinking and this activity seems to correspond to drink bouts in males but not females. The effects appear to be lateralized with inputs to the left DLS being of particular interest. 

      Strengths: 

      Increasing alcohol intake in females is of concern and the consequences for substance use disorder and brain health are not fully understood, so this is an area that needs further study. The attempt to link fine-grained drinking behaviour with neural activity has the potential to enrich our understanding of the neural basis of behaviour, beyond what can be gleaned from coarser measures of volumes consumed etc. 

      Weaknesses: 

      The introduction to the drinking in the dark (DID) paradigm is rather narrow in scope (starting line 47). This would be improved if the authors framed this in the context of other common intermittent access paradigms and gave due credit to important studies and authors that were responsible for the innovation in this area (particularly studies by Wise, 1973 and returned to popular use by Simms et al 2010 and related papers; e.g., Wise RA (1973). Voluntary ethanol intake in rats following exposure to ethanol on various schedules. Psychopharmacologia 29: 203-210; Simms, J., Bito-Onon, J., Chatterjee, S. et al. Long-Evans Rats Acquire Operant Self-Administration of 20% Ethanol Without Sucrose Fading. Neuropsychopharmacol 35, 1453-1463 (2010).)

      We appreciate the reviewer’s perspective on the history of the alcohol research field. There are hundreds of papers that could be cited regarding all the numerous different permutations of alcohol drinking paradigms. This study is an eLife “Research Advances” manuscript that is a direct follow-up study to a previously published study in eLife (Haggerty et al., 2022) that focused on the Drinking in the Dark model of binge alcohol drinking. This study must be considered in the context of that previous study (they are linked), and thus we feel that a comprehensive review of the literature is not appropriate for this study.

      The original drinking in the dark demonstrations should also be referenced (Rhodes et al., 2005). Line 154 Theile & Navarro 2014 is a review and not the original demonstration. 

      This is a good recommendation. We have added this citation to Line 33 and changed Line 154.

      When sex differences in alcohol intake are described, more care should be taken to be clear about whether this is in terms of volume (e.g. ml) or blood alcohol levels (BAC, or at least g/kg as a proxy measure). This distinction was often lost when lick responses were being considered. If licking is similar (assuming a single lick from a male and female brings in a similar volume?), this might mean males and females consume similar volumes, but females due to their smaller size would become more intoxicated so the implications of these details need far closer consideration. What is described as identical in one measure, is not in another. 

      As shown in Figure 1, all measures of intake are reported as g/kg for both water and alcohol to assess intakes across fluids that are controlled by body weights. We do not reference changes in fluid volume or BACs to compare differences in measured lickometry or photometric signals, except in one instance where we suggest that the total volume of water (ml) is greater than the total amount of alcohol (ml) consumed in DID sessions, but this applies generally to all animals, regardless of sex, across all the experimental procedures.

      In Figure 2 – Figure Supplement 1 we show drinking microstructures across single DID sessions, and that males and females drink similarly, but not identically, when assessing drinking measures at the smallest timescale that we have the power to detect with the hardware we used for these experiments. Admittedly, the variability seen in these measures is certainly non-zero, and while we are tempted to assume that there exist at least some singular drinks that occur identically between males and females in the dataset that support the idea that females are simply just consuming more volume of fluid per singular drink, we don’t have the sampling resolution to support that claim statistically. Further, even if females did consume more volume per singular drink than males, we do not believe that is enough information to make the claim that such behavior leads to more “intoxication” in females compared to males, as we know that alcohol behaviors, metabolism, and uptake/clearance all differ significantly by sex and are contributing factors towards defining an intoxication state. We’ve amended the manuscript to remove any language referencing these drinking behaviors as identical.

      No conclusions regarding the photometry results can be drawn based on the histology provided. Localization and quantification of viral expression are required at a minimum to verify the efficacy of the dual virus approach (the panel in Supplementary Figure 1 is very small and doesn't allow terminals to be seen, and there is no quantification). Whether these might differ by sex is also necessary before we can be confident about any sex differences in neural activity. 

      We provide hit maps of our fiber placements and viral injection centers based on histological verification, as we and many other investigators regularly do for publication. Figure 1A clearly shows the viral strategy taken to label AIC to DLS projections with GCaMP7s, and a representative image shows green GCaMP-positive terminals below the fiber placement. Animals without proper viral expression displayed little or no GCaMP signal, which serves as an expression-based control in addition to the typical histology performed to confirm “hits”. Animals with poor expression or obvious misplacement of the fiber probes were removed as described in the methods. Further, we report our calcium signals as z-scored changes in observed fluorescence; thus we are comparing scaled averages of signals across sexes and days, which helps minimize any differences between “low” or “high” viral transduction levels at the terminals directly underneath the tips of the fibers.
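
      The scaling argument above can be illustrated with a minimal sketch. It assumes that expression level acts as a pure multiplicative gain on the raw trace and that z-scoring is done against a pre-event baseline window; both are assumptions made for illustration, the function name `zscore_to_baseline` is hypothetical, and the numbers are invented rather than taken from the study.

```python
from statistics import mean, pstdev


def zscore_to_baseline(trace, baseline_len):
    """Z-score a fluorescence trace against its first `baseline_len` samples."""
    baseline = trace[:baseline_len]
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        raise ValueError("baseline has zero variance; cannot z-score")
    return [(x - mu) / sigma for x in trace]


# Invented raw traces: the same response at "low" and "high" expression,
# modeled as a 3x multiplicative gain (an assumption, not measured data).
low = [1.0, 1.2, 0.9, 1.1, 2.0, 2.4, 1.8]
high = [3.0 * x for x in low]

z_low = zscore_to_baseline(low, baseline_len=4)
z_high = zscore_to_baseline(high, baseline_len=4)

# The gain cancels, so both traces map to (numerically) the same z-scores.
max_diff = max(abs(a - b) for a, b in zip(z_low, z_high))
```

      Under these assumptions the two z-scored traces coincide. The sketch does not cover the case where low expression also lowers the signal-to-noise ratio, which z-scoring cannot correct.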

      While the authors have some previous data on the AIC to DLS pathway, there are many brain regions and pathways impacted by alcohol and so the focus on this one in particular was not strongly justified. Since photometry is really an observational method, it's important to note that no causal link between activity in the pathway and drinking has been established here. 

      As mentioned above, this article is an eLife Research Advances article that builds on our previous AIC to DLS work published in eLife (Haggerty et al., 2022). Considering that this is a linked article, a justification for why this brain pathway was chosen is superfluous. In addition, an exhaustive review of all the different brain regions and pathways that are affected by binge alcohol consumption to justify this pathway seems more appropriate to a review article than an article such as this.  

      We make no claims that photometric recordings are anything but observational, but we did observe these signals to be different when time-locked to the beginning of drinking behaviors. We describe this link between activity in the pathway and drinking throughout the manuscript. It is indeed correlational, but just because it is not causal does not mean that our findings are invalid or unimportant.

      It would be helpful if the authors could further explain whether their modified lickometers actually measure individual licks. While in some systems contact with the tongue closes a circuit which is recorded, the interruption of a photobeam was used here. It's not clear to me whether the nose close to the spout would be sufficient to interrupt that beam, or whether a tongue protrusion is required. This detail is important for understanding how the photometry data is linked to behaviour. The temporal resolution of the GCaMP signal is likely not good enough to capture individual licks but I think more caution or detail in the discussion of the correspondence of these events is required. 

      The lickometers do not capture individual licks, but a robust quantification of the information they capture is described in Godynyuk et al. 2019 and referenced in multiple other papers (Flanigan et al. 2023, Haggerty et al. 2022, Grecco et al. 2022, Holloway et al. 2023) where these lickometers have been used. However, individual lick tracking is not a requirement for tracking drinking behaviors more generally. The lickometers used clearly track when the animals are at the bottles, drinking fluids, and we have used the start of that lickometer signal to time-lock our photometry signals to drinking behaviors. We make no claims or have any data on how photometric signals may be altered on timescales of single licks. In regard to how AIC to DLS signals change on the second time scale when animals initiate drinking behaviors, we believe we explain these signals with caution and in context of the behaviors they aim to describe.

      Even if the pattern of drinking differs between males and females, the use of the word "strategy" implies a cognitive process that was never described or measured. 

      We use the word strategy to describe a plan of action that is executed by some chunking of motor sequences that amounts to a behavioral event, in this case drinking a fluid. We do not mean to imply anything further than this by using this specific word.

      Reviewer #2 (Public Review): 

      Summary: 

      This study looks at sex differences in alcohol drinking behaviour in a well-validated model of binge drinking. They provide a comprehensive analysis of drinking behaviour within and between sessions for males and females, as well as looking at the calcium dynamics in neurons projecting from the anterior insula cortex to the dorsolateral striatum. 

      Strengths: 

      Examining specific sex differences in drinking behaviour is important. This research question is currently a major focus for preclinical researchers looking at substance use. Although we have made a lot of progress over the last few years, there is still a lot that is not understood about sex-differences in alcohol consumption and the clinical implications of this. 

      Identifying the lateralisation of activity is novel, and has fundamental importance for researchers investigating functional anatomy underlying alcohol-driven behaviour (and other reward-driven behaviours). 

      Weaknesses: 

      Very small and unequal sample sizes, especially females (9 males, 5 females). This is probably ok for the calcium imaging, especially with the G-power figures provided, however, I would be cautious with the outcomes of the drinking behaviour, which can be quite variable. 

      For female drinking behaviour, rather than this being labelled "more efficient", could this just be that female mice (being substantially smaller than male mice) just don't need to consume as much liquid to reach the same g/kg. In which case, the interpretation might not be so much that females are more efficient, as that mice are very good at titrating their intake to achieve the desired dose of alcohol. 

      We agree that the “more efficient” drinking language could be bolstered by additional discussion in the text, and thus have added this to the manuscript starting at line 440.

      I may be mistaken, but is ANCOVA, with sex as the covariate, the appropriate way to test for sex differences? My understanding was that with an ANCOVA, the covariate is a continuous variable that you are controlling for, not looking for differences in. In that regard, given that sex is not continuous, can it be used as a covariate? I note that in the results, sex is defined as the "grouping variable" rather than the covariate. The analysis strategy should be clarified. 

      In lines 265-267, we explicitly state that the covariate factor was sex, which is mathematically correct based on the analyses we ran. We made an in-text error where we referred to sex as a grouping variable on Line 352, when it should have been the covariate. Thank you for the catch and we have corrected the manuscript.

      But, to reiterate, we are attempting to determine if the regression fits by sex are significantly different, which would be reported as a significant covariate. Sex is certainly a categorical variable, but the two measures we are comparing are continuous, so we believe an ANCOVA is valid here.
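
      One standard way to ask whether regression fits differ between groups is the homogeneity-of-slopes F test, which compares a model with one slope per group against a model with a common slope and group-specific intercepts. The sketch below is only an illustration of that general test; it is not claimed to reproduce the analysis the authors ran. The function name `slope_homogeneity_f` and all data values are hypothetical.

```python
def slope_homogeneity_f(groups):
    """F test for equal regression slopes across groups.

    groups: dict mapping a group label to a list of (x, y) pairs.
    Returns (F, df_numerator, df_denominator).
    Full model: separate slope and intercept per group.
    Reduced model: common slope, separate intercepts.
    """
    sxx_total = sxy_total = syy_total = 0.0
    rss_full = 0.0
    n_total = 0
    for pairs in groups.values():
        n = len(pairs)
        mx = sum(x for x, _ in pairs) / n
        my = sum(y for _, y in pairs) / n
        sxx = sum((x - mx) ** 2 for x, _ in pairs)
        sxy = sum((x - mx) * (y - my) for x, y in pairs)
        syy = sum((y - my) ** 2 for _, y in pairs)
        rss_full += syy - sxy ** 2 / sxx  # residual of this group's own line
        sxx_total += sxx
        sxy_total += sxy
        syy_total += syy
        n_total += n
    # Residual when all groups are forced to share one (pooled) slope.
    rss_reduced = syy_total - sxy_total ** 2 / sxx_total
    k = len(groups)
    df_num = k - 1
    df_den = n_total - 2 * k
    f_stat = ((rss_reduced - rss_full) / df_num) / (rss_full / df_den)
    return f_stat, df_num, df_den


# Invented example data (not from the study): two groups with visibly
# different slopes of y on x.
example = {
    "male": [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8)],
    "female": [(1, 1.0), (2, 1.6), (3, 1.9), (4, 2.5)],
}
f_stat, df_num, df_den = slope_homogeneity_f(example)
```

      A large F relative to the F distribution with `(df_num, df_den)` degrees of freedom suggests the slopes differ by group. Obtaining a p-value requires an F-distribution routine, which is left out to keep the sketch dependency-free.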

      Reviewer #3 (Public Review): 

      Summary: 

      In this manuscript by Haggerty and Atwood, the authors use a repeated binge drinking paradigm to assess how water and ethanol intake changes in male and female mice as well as measure changes in anterior insular cortex to dorsolateral striatum terminal activity using fiber photometry. They find that overall, males and females have similar water and ethanol intake, but females appear to be more efficient alcohol drinkers. Using fiber photometry, they show that the anterior insular cortex (AIC) to dorsolateral striatum (DLS) projections have sex, fluid, and lateralization differences. The male left circuit was most robust when aligned to ethanol drinking, and water was somewhat less robust. Male right, and female left and right, had essentially no change in photometry activity. To some degree, the changes in terminal activity appear to be related to fluid exposure over time, as well as within-session differences in trial-by-trial intake. Overall, the authors provide an exhaustive analysis of the behavioral and photometric data, thus providing the scientific community with a rich information set to continue to study this interesting circuit. However, although the analysis is impressive, there are a few inconsistencies regarding specific measures (e.g., AUC, duration of licking) that do not quite fit together across analytic domains. This does not reduce the rigor of the work, but it does somewhat limit the interpretability of the data, at least within the scope of this single manuscript. 

      Strengths: 

      - The authors use high-resolution licking data to characterize ingestive behaviors. 

      - The authors account for a variety of important variables, such as fluid type, brain lateralization, and sex. 

      - The authors provide a nice discussion on how this data fits with other data, both from their laboratory and others'. 

      - The lateralization discovery is particularly novel. 

      Weaknesses: 

      - The volume of data and number of variables provided makes it difficult to find a cohesive link between data sets. This limits interpretability.

      We agree there is a lot of data and variables within the study design, but also believe it is important to display the null and positive findings alongside each other to describe the changes we measured holistically across water and alcohol drinking.

      - The authors describe a clear sex difference in the photometry circuit activity. However, I am curious about whether female mice that drink more similarly to males (e.g., less efficiently?) also show increased activity in the left circuit, similar to males. Oppositely, do very efficient males show weaker calcium activity in the circuit? Ultimately, I am curious about how the circuit activity maps to the behaviors described in Figures 1 and 2. 

      In Figure 3C, we show that across the time window of drinking behaviors, female mice who drink alcohol do have a higher baseline calcium activity compared to water-drinking female mice, so we believe there are certainly alcohol-induced changes in AIC to DLS within females, but there remains a lack of engagement (as measured by changes in amplitude) compared to males. So, when comparing consummatory patterns that are similar by sex, we still see the lack of calcium signaling near the drinking bouts, but small shifts in baseline activity that we aren’t truly powered to resolve (using an AUC or similar measurements for quantification) because the shifts are so small. Ultimately, we presume that the AIC to DLS inputs in females aren’t the primary node for encoding this behavior, and some recent work out of David Werner’s group (Towner et al. 2023) suggests that for males who drink, the AIC becomes a primary node of control, whereas in females, the PFC and ACC are more engaged. Thus, the mapping of the circuit activity onto the drinking behaviors more generally represented in Figures 1 and 2 may be sexually dimorphic and further studies will be needed to resolve how females engage differential circuitry to encode ongoing binge drinking behaviors.

      - What does the change in water-drinking calcium imaging across time in males mean? Especially considering that alcohol-related signals do not seem to change much over time, I am not sure what it means to have water drinking change. 

      The AIC seems to encode many physiologically relevant, interoceptive signals, and the change in water drinking signals in males was puzzling to us as well. Currently, we think it may reflect both the animals becoming more efficient at drinking out of the lickometers in early weeks and signaling changes due to thirst states or taste associated with the fluid. This is speculation, and more in-depth studies are needed to determine how thirst states or taste may modulate AIC to DLS inputs, but we believe that is beyond the scope of this current study.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Line 45 - states alcohol use rates are increasing in females across the past half-decade. I thought this trend was apparent over the past half-century? Please consider revising this. 

      According to NIAAA, the gap between rates of alcohol consumption in females and males has been closing for about the past 100 years, but only recently have those trends started to reverse, with females drinking similar amounts to or more than males.

      Placing more of the null findings into supplemental data would make the long paper more accessible to the reader. 

      In reference to Reviewer #3’s point as well, there is a lot of data we present, and we hope others will use this data, both null and positive findings, in their future work. As formatted on eLife’s website, we think it is important to place these findings in-line as well.

      Reviewer #2 (Recommendations For The Authors): 

      In addition to the points raised about analysis and interpretation in the Public Review, I have a minor concern about the written content. I find the final sentence of the introduction "together these findings represent targets for future pharmacotherapies.." a bit unjustified and meaningless. The findings are important for a basic understanding of alcohol drinking behaviour, but it's unclear how pharmacotherapies could target lateralised aic inputs into dls. 

      There are on-going studies (CANON-Pilot Study, BRAVE Lab, Stanford) for targeted therapies that use technologies like TMS and focused ultrasound to activate the AIC to alleviate alcohol cravings and decrease heavy drinking days. The difficulty with these next-generation therapeutics is often targeting, and thus we think this work may be of use to those in the clinic to further develop these treatments. We agree that this data does not support the development of pharmacotherapies in a traditional sense, and thus have removed the word and added text to reference TMS and ultrasound approaches to bolster this statement in lines 101+.

    1. eLife assessment

      This paper introduces an efficient approach to identify subunits in the receptive fields of retinal ganglion cells. The general approach has been used in this application previously and this limits the conceptual advance of the paper. The improved speed is valuable, as it allows a more thorough exploration of the control parameters in this analysis and facilitates application to larger populations of cells. Validation of the approach is convincing. The paper would benefit from a more thorough exploration of the method and its limitations, or an extension of the new results about subunit populations.

    2. Reviewer #1 (Public Review):

      Summary:

      This paper introduces an efficient approach to infer properties of receptive-field subunits from the ensemble of spike-triggered stimuli. This is an important general problem in sensory coding. The results introduced in the paper make a solid contribution to both how subunits can be identified and how subunits of different types are coordinated in space.

      Strengths:

      A primary strength of the paper is the development of approaches that substantially speed non-negative matrix factorization and by doing so create an opportunity for a more systematic exploration of how the procedure depends on various control parameters. The improved procedure is well documented and the direct comparisons with previous procedures are helpful. The improved efficiency enabled several improvements in the procedure - notably tests of good procedures for initializing NNMF and tests of the dependence of the results on the sparsity regularization parameter.

      A second strength of the paper is the exploration of the spatial relationship between different subunits. This, to my knowledge, is new and is an interesting direction. There are some concerns about this analysis (see weaknesses below), but if this analysis can be strengthened it will provide new information that will be important both functionally and developmentally.

      Weaknesses:

      A primary concern is that choices made about parameters for several aspects of the analysis appear to be made subjectively. Much of this centers around how much of the structure in the extracted subunits is imposed by the procedure itself, and how much reflects the underlying neural circuitry. Some specific issues related to this concern are:

      - Sparsity: the use of the autocorrelation function to differentiate real vs spurious subunits should be documented and validated. For example, can the authors split data in half and show that the real subunits are stable?

      - Choice of regularization: the impact of the regularization parameter on subunit properties is nicely documented. However, the choice of an appropriate regularization parameter seems somewhat arbitrary. Line 253-256 is an example of this problem: this sentence sounds circular - as if the sparsity factor was turned up until the authors obtained what they expected to obtain. Could the choice of this parameter significantly impact the properties of the extracted subunits? How sensitive are the subunit properties to that parameter? Some additional control analyses are needed to validate the parameter choice (see the crossvalidation comment below).

      - Crossvalidation was not used to identify the regularization constraint value because the weight matrix from NNMF does not generalize beyond the data it was fit to. Could the authors instead hold the components matrix fixed and recompute the weight matrix, and use that approach for cross-validation (especially since it is really the components matrix that needs validating)?
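
      The cross-validation idea in the last point can be sketched as follows: keep the components matrix fixed, refit only the weights on held-out data, and score the reconstruction. This is a minimal illustration on synthetic data, not the authors' procedure. It assumes unconstrained weights (as in a semi-NMF setting) refit by ordinary least squares, and the function name `heldout_error` is hypothetical. In practice the components would come from a fit on the training portion of the data.

```python
import numpy as np


def heldout_error(components, heldout):
    """Relative reconstruction error of held-out data with components fixed.

    components: (K, P) array, K components over P stimulus pixels.
    heldout:    (N, P) array of held-out spike-triggered stimuli.
    Weights are refit by unconstrained least squares (an assumption).
    """
    # Solve components.T @ W ~= heldout.T for W, shape (K, N).
    weights, *_ = np.linalg.lstsq(components.T, heldout.T, rcond=None)
    residual = heldout.T - components.T @ weights
    return float(np.linalg.norm(residual) / np.linalg.norm(heldout))


# Synthetic stand-in data: 3 non-negative "subunits" over 20 pixels.
rng = np.random.default_rng(0)
true_components = np.abs(rng.normal(size=(3, 20)))
true_weights = rng.normal(size=(100, 3))
heldout = true_weights @ true_components + 0.1 * rng.normal(size=(100, 20))

# Components that generated the data versus unrelated random components.
err_true = heldout_error(true_components, heldout)
err_random = heldout_error(np.abs(rng.normal(size=(3, 20))), heldout)
```

      Components that capture real structure should reconstruct held-out stimuli with lower error than unrelated ones, so this score could be compared across regularization values. Whether reconstruction error is the right criterion for subunit recovery is a separate question that the sketch does not settle.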

      The paper would benefit from a more complete comparison with known anatomy. For example, can the authors estimate the number of cones within each subunit? This is well-constrained both anatomically (at least in macaque) and, especially for midget ganglion cell subunits, functionally. In macaque, most midget bipolar cells get input from single cones, so the number of extracted subunits should be close to the number of cones. This would be a useful point of comparison for the current work.

      Is the analysis of the spatial relationship between different subunit mosaics robust to the incompleteness of those mosaics? The argument on lines 496-503 should be backed up by more analysis. For example, if subunits are removed from regions where the mosaic is pretty complete, do the authors change the spatial dependence? Alternatively, could they use synthetic mosaics with properties like those measured to check the sensitivity to missing cells?

      NNMF relies on accounting for each spike-triggered stimulus with a linear combination of components. Would nonlinearities - e.g. those in the bipolar cell outputs - substantially change the results?

      Does the approach work for cells that receive input from multiple bipolar types? Some ganglion cells, e.g. in mice, receive input from multiple bipolar types, each accounting for a sizable percentage of the total input. There is similar anatomical work indicating that parasol cells may receive input from multiple diffuse bipolar types. It is not clear whether the current approach works in cases where the subunits of a single ganglion cell overlap. Some discussion of this would be useful.

    3. Reviewer #2 (Public Review):

      Summary:

      Identifying spatial subunits within the receptive field of retinal ganglion cells can help study spatial nonlinearities and upstream computations performed by the bipolar cells. The authors significantly accelerate the implementation of the previously proposed Spike Triggered semi-non-negative Matrix Factorization (STNMF) method to identify the subunits. The authors also propose a few method improvements - better initialization; new stability-based criteria for selecting the regularization strength, and hyperparameter selection across cell types.

      The authors then apply this new method to RGC populations in both the salamander retina and the macaque (marmoset) retina. The authors document the subunit sizes, numbers, and overlap across cell types. The neuroscience finding describes the anti-coordination of ON and OFF parasol receptive fields, but not for the corresponding subunits.

      Overall, the authors claim that a faster and more accurate method makes scale-up to large neuronal populations feasible.

      Strengths:

      - The paper is well-written, easy to read and the figures are clear. The limitations are also made clear.

      - The scientific findings are novel and seem to be well supported.

      - The claimed speed-up of the method is potentially important for practical applications to large populations. Each innovation of the method is well-supported.

      - This is a serious effort to improve the method and document the subunits in primate retina.

      Weaknesses:

      - The description of the method is confusing. Currently, the new method is described in the context of changes from existing methods. As someone who is not familiar with previous methods, it is very confusing to follow the details.

      - I think it will help a lot with clarity to have a concise flowchart/pseudocode to summarize the algorithm and separate it from a description of the main changes from previous methods.

      - Separate pseudocodes can be provided for the main method, initialization, regularization parameter selection using consensus, and identifying the regularization parameter across cell types.

      - While the new method clearly shows a drastic improvement compared to the previous method on a laptop, would it be possible to get the same improvement on the previous method if it was implemented with GPU (as is standard for most AI/ML algorithms)?

      - For the calculation of subunits across multiple cells, can you run multiple parallel jobs on the same computer? This may make some innovations unnecessary (like setting the same regularization strength across multiple cells).

      - There are two main innovations in this paper: the fast and approximate method, and analysis of subunit mosaics for primate RGCs. It would be helpful to include an analysis of the primate RGC subunits using the older, slower, but more exact method and show that the major scientific results can be reproduced. This would validate the new method in an end-to-end manner. While this may take a while to run, it may be helpful in the supplement.

      - It would be important to understand the data-efficiency of the method. The approximate method may deviate more from the exact method when the amount of data is limited.

      - Would it be possible to have a few steps of the exact method at the end to ensure that the solution truly optimizes the objective function?

      - Does the number of estimated subunits change with the number of observed spikes? If so, the estimates of subunit number/size must be interpreted with caution.

    4. Reviewer #3 (Public Review):

      Summary:

      This work addresses the problem of determining the subunit composition of receptive fields of retinal ganglion cells (RGCs). RGCs process stimuli through non-linear transforms that largely (although not entirely) reflect the individual contributions of their input bipolar cells, which themselves process visual stimuli nonlinearly. Thus, using the correct system identification methods might correctly model the RGC cells, while revealing details of the underlying circuit, including the function of the presynaptic components. It is now well established that a model of the form of an LNLN cascade can potentially capture this bipolar-RGC circuit, although the devil is in the details. The authors present an improved method of non-negative matrix factorization (NMF) - which is one approach to this system identification problem - that can speed things up by a factor of 100, and in doing so infer plausible mosaics of the bipolar cell types supporting the identified RGC types that are recorded from.

      As written, the focus of this paper seems almost entirely methodological, supporting the sped-up version of NMF, called STNMF. The >100x speedup potentially makes a lot more measurements available, since it enables much more comprehensive scans across model meta-parameters, although has its own complications that must also be methodologically addressed. The results presented are largely a demonstration and validation of the potential power of this approach using example recordings in the peripheral marmoset retina. I do not think the results themselves are meant to be evaluated as definitive, since they are often based on examples and are largely confirmatory of what is already known.

      Strengths:

      I have very few concerns about the paper methodologically: these methods are well laid out and demonstrated (at least up to the level of my expertise and interest), including validation with established literature.

      I am also enthusiastic about some of the potential results in the retina outlined (but not fully fleshed out) in the later sections of the paper.

      Weaknesses:

      My main critique is to question the conceptual advance in this paper: what did we learn, and what is the targeted audience of interest? Establishing this is particularly important for this manuscript since NMF has already been established and expounded on as a useful approach in this context (including by the author most recently in 2017), so any of the scientific results is already achievable with enough computer power using existing approaches. As currently cast, the conceptual advances here are purely methodological and relate to the utility of speeding up the approach. Also, they do not appear to generalize to other problems outside of the narrow range to which the method is currently applied.

      Thus, two paths to improving the manuscript would be either:

      (1) target readers interested in the retina by fully fleshing out the current results and add more to make this into a paper about the retina rather than about the STNMF method, or

      (2) demonstrate that the methods might be useful outside of the very narrow set of conditions specific to identifying nonlinear bipolar cell subunits in peripheral retina under white noise stimulation.

      In its current state, the Discussion addressing limitations and generality seems to suggest applicability past this narrow condition, which I do not think is the case, but I would be happy to be convinced otherwise.

      For fleshing out scientific results: in the current manuscript, they are presented to validate the approach and are largely confirmatory of what we already know about the retina (which allows for this validation). Also, many of the results are measurements based on examples, and not accumulated past a single recording in some cases. Finally, it is not clear to what extent these results depend on the specific recordings in the peripheral marmoset retina: what about more central in the retina, or in other species?

      For demonstrating the utility of the methodology: here are some of the main limitations to generalizing past this specific case:

      (1) the necessity of linear or near-linear processing in previous layers;

      (2) lack of any negative components;

      (3) lack of ability to account for other influences on spiking than the positive contributions of LN subunits;

      (4) necessity of white noise stimulation that is specifically sized for a uniform subunit size.

      Together, I believe this precludes potential applications to other areas in the brain: further back in the visual system will require non-linear transforms as well as the convergence of positive and negative inputs. Other sensory systems like the auditory system are even more non-linear well before getting to even mid-level pre-cortical structures and also combine positive and negative influences. Given the importance of inhibition in the retina (including what is thought to be an important role of amacrine cells in shaping RGC responses), it is not clear how general this approach is in the retina, although the specific results shown are believable. How could this approach generalize, realistically? Could applications to other types of data be demonstrated, and/or plausibly get by these fundamental limitations? How?

    1. eLife assessment

      This valuable study examines the role of the interaction between cytoplasmic N- and C-terminal domains in voltage-dependent gating of Kv10.1 channels. The authors suggest that they have identified a hidden open state in Kv10.1 mutant channels, thus providing a window for observing early conformational transitions associated with channel gating. The evidence supporting the major conclusions is solid, but additional work is required to determine the molecular mechanism underlying the observations in this study. Learning the molecular mechanisms could be significant in understanding the gating mechanisms of the KCNH family and will appeal to biophysicists interested in ion channels and physiologists interested in cancer biology.

    2. Gating of Kv10 channels is unique because it involves coupling between non-domain swapped voltage sensing domains, a domain-swapped cytoplasmic ring assembly formed by the N- and C-termini, and the pore domain. Recent structural data suggests that activation of the voltage sensing domain relieves a steric hindrance to pore opening, but the contribution of the cytoplasmic domain to gating is still not well understood. This aspect is of particular importance because proteins like calmodulin interact with the cytoplasmic domain to regulate channel activity. The effects of calmodulin (CaM) in WT and mutant channels with disrupted cytoplasmic gating ring assemblies are contradictory, resulting in inhibition or activation, respectively. The underlying mechanism for these discrepancies is not understood. In the present manuscript, Reham Abdelaziz and collaborators use electrophysiology, biochemistry and mathematical modeling to describe how mutations and deletions that disrupt inter-subunit interactions at the cytoplasmic gating ring assembly affect Kv10.1 channel gating and modulation by CaM. In the revised manuscript, additional information is provided to allow readers to identify within the Kv10.1 channel structure the location of E600R, one of the key channel mutants analyzed in this study. However, the mechanistic role of the cytoplasmic domains that this study focuses on, as well as the location of the ΔPASCap deletion and other perturbations investigated in the study remain difficult to visualize without additional graphical information.

      The authors focused mainly on two structural perturbations that disrupt interactions within the cytoplasmic domain, the E600R mutant and the ΔPASCap deletion. By expressing mutants in oocytes and recording currents using Two Electrode Voltage-Clamp (TEV), it is found that both ΔPASCap and E600R mutants have biphasic conductance-voltage (G-V) relations and exhibit activation and deactivation kinetics with multiple voltage-dependent components. Importantly, the mutant-specific component in the G-V relations is observed at negative voltages where WT channels remain closed. The authors argue that the biphasic behavior in the G-V relations is unlikely to result from two different populations of channels in the oocytes, because they found that the relative amplitude between the two components in the G-V relations was highly reproducible across individual oocytes that otherwise tend to show high variability in expression levels. Instead, the G-V relations for all mutant channels could be well described by an equation that considers two open states O1 and O2, and a transition between them; O1 appeared to be unaffected by any of the structural manipulations tested (i.e. E600R, ΔPASCap, and other deletions) whereas the parameters for O2 and the transition between the two open states were different between constructs. The O1 state is not observed in WT channels and is hypothesized to be associated with voltage sensor activation. O2 represents the open state that is normally observed in WT channels and is speculated to be associated with conformational changes within the cytoplasmic gating ring that follow voltage sensor activation, which could explain why the mutations and deletions disrupting cytoplasmic interactions affect primarily O2.

      Severing the covalent link between the voltage sensor and pore reduced O1 occupancy in one of the deletion constructs. Although this observation is consistent with the hypothesis that voltage-sensor activation drives entry into O1, this result is not conclusive. Structural as well as functional data has established that the coupling of the voltage sensor and pore does not entirely rely on the S4-S5 covalent linker between the sensor and the pore, and thus the severed construct could still retain coupling through other mechanisms, which is consistent with the prominent voltage dependence that is observed. If both states O1 and O2 require voltage sensor activation, it is unclear why the severed construct would affect state O1 primarily, as suggested in the manuscript, as opposed to decreasing occupancy of both open states. In line with this argument, the presence of Mg2+ in the extracellular solution affected both O1 and O2. This finding suggests that entry into both O1 and O2 requires voltage-sensor activation because Mg2+ ions are known to stabilize the voltage sensor in its most deactivated conformations.

      Activation towards and closure from O1 is slow, whereas channels close rapidly from O2. A rapid alternating pulse protocol was used to take advantage of the difference in activation and deactivation kinetics between the two open components in the mutants and thus drive an increasing number of channels towards state O1. Currents activated by the alternating protocol reached larger amplitudes than those elicited by a long depolarization to the same voltage. This finding is interpreted as an indication that O1 has a larger macroscopic conductance than O2. In the revised manuscript, the authors performed single-channel recordings to determine why O1 and O2 have different macroscopic conductance. The results show that at voltages where the state O1 predominates, channels exhibited longer open times and overall higher open probability, whereas at more depolarized voltages where occupancy of O2 increases, channels exhibited more flickery gating behavior and decreased open probability. These results are informative but not conclusive since single-channel amplitudes could not be resolved at strong depolarizations, limiting the extent to which the data could be analyzed. In the last revision, the authors have included one representative example showing inhibition of single channel activity by the Kv10-specific inhibitor astemizole. Group data analysis would be needed to conclusively establish that the currents that were recorded indeed correspond to Kv10 channels.

      It is shown that conditioning pulses to very negative voltages result in mutant channel currents that are larger and activate more slowly than those elicited at the same voltage but starting from less negative conditioning pulses. In voltage-activated curves, O1 occupancy is shown to be favored by increasingly negative conditioning voltages. This is interpreted as indicating that O1 is primarily accessed from deeply closed states in which voltage sensors are in their most deactivated position. Consistently, a mutation that destabilizes these deactivated states is shown to largely suppress the first component in voltage-activation curves for both ΔPASCap and E600R channels.

      The authors then address the role of the hidden O1 state in channel regulation by calcium-calmodulin (CaM). Stimulating calcium entry into oocytes with ionomycin and thapsigargin, assumed to enhance CaM-dependent modulation, resulted in preferential potentiation of the first component in ΔPASCap and E600R channels. This potentiation was attenuated by including an additional mutation that disfavors deeply closed states. Together, these results are interpreted as an indication that calcium-CaM preferentially stabilizes deeply closed states from which O1 can be readily accessed in mutant channels, thus favoring current activation. In WT channels lacking a conducting O1 state, CaM stabilizes deeply closed states and is therefore inhibitory. It is found that the potentiation of ΔPASCap and E600R by CaM is more strongly attenuated by mutations in the channel that are assumed to disrupt interaction with the C-terminal lobe of CaM than mutations assumed to affect interaction with the N-terminal lobe. These results are intriguing but difficult to interpret in mechanistic terms. The strong effect that calcium-CaM had on the occupancy of the O1 state in the mutants raises the possibility that O1 can be only observed in channels that are constitutively associated with CaM. To address this, a biochemical pull-down assay was carried out to establish that only a small fraction of channels are associated with CaM under baseline conditions. These CaM experiments are potentially very interesting and could have wide physiological relevance. However, the approach utilized to activate CaM is indirect and could result in additional non-specific effects on the oocytes that could affect the results.

      Finally, a mathematical model is proposed consisting of two layers involving two activation steps for the voltage sensor, and one conformational change in the cytoplasmic gating ring - completion of both sets of conformational changes is required to access state O2, but accessing state O1 only requires completion of the first voltage-sensor activation step in the four subunits. The model qualitatively reproduces most major findings on the mutants. Although the model used is highly symmetric and appears simple, the mathematical form used for the rate constants in the model adds a layer of complexity to the model that makes mechanistic interpretations difficult. In addition, many transitions that from a mechanistic standpoint should not depend on voltage were assigned a voltage dependence in the model. These limitations diminish the mechanistic insight that can be reliably extracted from the model.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      We appreciate the feedback provided and refer to our previous response for detailed explanations regarding our decisions on some of the recommendations made by the referees and editors. We have introduced changes as follows:

      • We added a supplementary Figure to Figure 5 to show inhibition by Astemizole at the single channel level.

      • We have corrected Figure 7A, where the normalized current did not reach 1 as a maximum. We had overlooked that this is expected when the prepulse was -160 mV, and the IV is strongly biphasic, but not when coming from -100 mV. We are thankful for this observation, which served to identify that the values for one of the cells were inverted with respect to the others (the sequence of stimuli was different during recording, and this information got lost in the analysis procedure). We have corrected this and made sure that such a mistake had not happened anywhere else.

      • Finally, we have corrected a typo in the discussion, as indicated in the review.

      We include a version with changes marked and a clean version of the manuscript.

    1. eLife assessment

      This important study utilizes a comprehensive array of animal and cellular models, alongside various techniques, to elucidate the mechanism by which adipose tissue miR-802 contributes to inflammation and metabolic dysfunction in obesity. The data is solid, with clear, reproducible changes showing low variability among biological replicates and consistency across different models. However, some conclusions should be further substantiated with additional data to enhance the scope and strength of the manuscript.

    2. Reviewer #1 (Public Review):

      In this manuscript, Yang et al. conduct a comprehensive investigation to demonstrate the role of adipose tissue miR-802 in obesity-associated inflammation and metabolic dysfunction. Using multiple models and techniques, they propose a mechanism where elevated levels of miR-802 in adipose tissue (both in mouse models and humans) trigger fat accumulation and inflammation, leading to increased adiposity and insulin resistance. They suggest that increased miR-802 levels in adipocytes during obesity result in the downregulation of TRAF3, a negative regulator of canonical and non-canonical NF-κB pathways. This downregulation induces inflammation through the production of cytokines/chemokines that attract and polarize macrophages. Concurrently, the NF-κB pathway induces the lipogenic transcriptional factor SREBP1, which promotes fat accumulation and further recruits pro-inflammatory macrophages. While the proposed model is supported by multiple experiments and consistent data, there are areas where the manuscript could be improved. Some improvements can be addressed in the text, while others require additional controls, experiments, or analyses.

      (1) The manuscript should provide measurements of lipid droplet/adipocyte size for all models, both in vitro and in vivo. In vivo studies should also include fat weight measurements. This is crucial to determine whether miR-802, TRAF3, and SREBP1 promote adiposity/fat accumulation across all models.

      (2) The rationale for co-culture experiments using WAT SVF is unclear, given that miR-802 is upregulated by obesity in adipocytes, not in the stromal-vascular fraction. These experiments would be more relevant if performed using isolated adipocytes or differentiated WAT SVF.

      (3) Figures 1G and 1H lack a control group (time 0 or NCD). Without this control, it is impossible to determine if inflammation precedes miR-802 upregulation.

      (4) The statement, "The knockout of miR-802 in adipose tissue did not alter food intake, body weight, glucose level, and adiposity (data not shown)," needs more detail regarding the age and sex of the animals. These data are important and should be reported, perhaps in a supplementary figure.

      (5) The terms "KO" (knockout) and "KI" (knock-in) are misleading for AAV models, as they do not modify the genome. "KD" (knockdown) and "OE" (overexpression) are more accurate.

      (6) The statement, "miR-802 expression was unaffected in other organs (Figure S3O)," should clarify that this is except for BAT.

      By addressing these points, the manuscript would present a more robust and clear demonstration of the role of miR-802 in obesity-associated inflammation and metabolic dysfunction.

    3. Reviewer #2 (Public Review):

      Yang et al. investigated the role of miR-802 in the development of adipose tissue (AT) inflammation during obesity. The authors found miR-802 levels are up-regulated in the AT of mouse models of obesity and insulin resistance as well as in the AT of humans. They further demonstrated that miR-802 regulates the intracellular levels of TRAF3 and downstream activation of the NF-κB pathway. Ultimately, controlling AT inflammation by manipulating miR-802 affected whole-body glucose homeostasis, highlighting the role of AT inflammatory status in whole-body metabolism. The study provides solid evidence on the role of adipocyte miR-802 in controlling inflammation and macrophage recruitment. However, how lipid mobilization from adipocytes and how engulfment of lipid droplets by macrophages control inflammatory phenotype in these cells could be better explored. The findings of this study will have a great impact in the field, contributing to the growing body of evidence on how microRNAs control the inflammatory microenvironment of AT and whole-body metabolism in obesity.

    4. Reviewer #3 (Public Review):

      MiR-802 appears to accumulate before macrophage numbers increase in adipose tissue in both mice and humans. The phenotype of miR-802 overexpression and deletion in vivo is striking and novel. Deletion of miR-802 in adipose tissue after obesity onset also attenuated adipose inflammation and improved systemic glucose homeostasis. Understanding how miR-802 affects the crosstalk between macrophage and adipocyte is a major point. For example, does miR-802 change the inflammatory state of macrophages as it increases Traf3 expression in adipocytes? This is important because macrophages are the source of inflammatory mediators that will activate the TNFR signaling pathway, potentially Traf3, resulting in impaired insulin-stimulated Glut4 translocation and glucose uptake. Also, modulation of miR-802 levels in vivo leads to alterations in adiposity. Here, what is a direct effect of miR-802 and what is a result of simply reduced adiposity? One key point is what triggers miR-802 expression, especially in obesity.

    1. eLife assessment

      This important study investigates how memory representations are transformed over time (24h period). The work advances our understanding of the neural processes supporting the behavioral integration of memories for distinct events that are never experienced together in time but are linked by shared predictive cues. Evidence supporting the claims is solid, and reporting of additional comparisons would have strengthened the study.

    2. Reviewer #1 (Public Review):

      In this paper, Tompary & Davachi present work looking at how memories become integrated over time in the brain, and relating those mechanisms to responses on a priming task as a behavioral measure of memory linkage. They find that remotely but not recently formed memories are behaviorally linked and that this is associated with a change in the neural representation in mPFC. They also find that the same behavioral outcomes are associated with the increased coupling of the posterior hippocampus with category-sensitive parts of the neocortex (LOC) during a post-learning rest period, again only for remotely learned information. There was also correspondence in rest connectivity (posterior hippocampus-LOC) and representational change (mPFC) such that for remote memories specifically, the initial post-learning connectivity enhancement during rest related to longer-term mPFC representational change.

      This work has many strengths. The topic of this paper is very interesting, and the data provide a really nice package in terms of providing a mechanistic account of how memories become integrated over a delay. The paper is also exceptionally well-written and a pleasure to read. There are two studies, including one large behavioral study, and the findings replicate in the smaller fMRI sample. I do however have two fairly substantive concerns about the analytic approach, where more data will be required before we can know whether the interpretations are an appropriate reflection of the findings. These and other concerns are described below.

      (1) One major concern relates to the lack of a pre-encoding baseline scan prior to recent learning.

      a) First, I think it would be helpful if the authors could clarify why there was no pre-learning rest scan dedicated to the recent condition. Was this simply a feasibility consideration, or were there theoretical reasons why this would be less "clean"? Including this information in the paper would be helpful for context. Apologies if I missed this detail in the paper.

      b) Second, I was hoping the authors could speak to what they think is reflected in the post-encoding "recent" scan. Is it possible that these data could also reflect the processing of the remote memories? I think, though am not positive, that the authors may be alluding to this in the penultimate paragraph of the discussion (p. 33) when noting the LOC-mPFC connectivity findings. Could there be the reinstatement of the old memories due to being back in the same experimental context and so forth? I wonder the extent to which the authors think the data from this scan can be interpreted as strictly reflecting recent memories, particularly given it is relative to the pre-encoding baseline from before the remote memories, as well (and therefore in theory could reflect both the remote + recent). (I should also acknowledge that, if it is the case that the authors think there might be some remote memory processing during the recent learning session in general, a pre-learning rest scan might not have been "clean" either, in that it could have reflected some processing of the remote memories, i.e., perhaps a clean pre-learning scan for the recent learning session related to point 1a is simply not possible.)

      c) Third, I am thinking about how both of the above issues might relate to the authors' findings, and would love to see more added to the paper to address this point. Specifically, I assume there are fluctuations in baseline connectivity profile across days within a person, such that the pre-learning connectivity on day 1 might be different from on day 2. Given that, and the lack of a pre-learning connectivity measure on day 2, it would logically follow that the measure of connectivity change from pre- to post-learning is going to be cleaner for the remote memories. In other words, could the lack of connectivity change observed for the recent scan simply be due to the lack of a within-day baseline? Given that otherwise, the post-learning rest should be the same in that it is an immediate reflection of how connectivity changes as a function of learning (depending on whether the authors think that the "recent" scan is actually reflecting "recent + remote"), it seems odd that they both don't show the same corresponding increase in connectivity, which makes me think it may be a baseline difference. I am not sure if this is what the authors are implying when they talk about how day 1 is most similar to prior investigation on p. 20, but if so it might be helpful to state that directly.

      d) Fourth and very related to my point 1c, I wonder if the lack of correlations for the recent scan with behavior is interpretable, or if it might just be that this is a noisy measure due to imperfect baseline correction. Do the authors have any data or logic they might be able to provide that could speak to these points? One thing that comes to mind is seeing whether the raw post-learning connectivity values (separately for both recent and remote) show the same pattern as the difference scores. However, the authors may come up with other clever ways to address this point. If not, it might be worth acknowledging this interpretive challenge in the Discussion.

      (2) My second major concern is how the authors have operationalized integration and differentiation. The pattern similarity analysis uses an overall correspondence between the neural similarity and a predicted model as the main metric. In the predicted model, C items that are indirectly associated are more similar to one another than they are to C items that are entirely unrelated. The authors are then looking at a change in correspondence (correlation) between the neural data and that prediction model from pre- to post-learning. However, a change in the degree of correspondence with the predicted matrix could be driven by either the unrelated items becoming less similar or the related ones becoming more similar (or both!). Since the interpretation in the paper focuses on change to indirectly related C items, it would be important to report those values directly. For instance, as evidence of differentiation, it would be important to show that there is a greater decrease in similarity for indirectly associated C items than there is for unrelated C items (or even a smaller increase) from pre to post, or that C items that are indirectly related are less similar than are unrelated C items post but not pre-learning. Performing this analysis would confirm that the pattern of results matches the authors' interpretation. This would also impact the interpretation of the subsequent analyses that involve the neural integration measures (e.g., correlation analyses like those on p. 16, which may or may not be driven by increased similarity among overlapping C pairs). I should add that given the specificity to the remote learning in mPFC versus recent in LOC and anterior hippocampus, it is clearly the case that something interesting is going on. However, I think we need more data to understand fully what that "something" is.

      (3) The priming task occurred before the post-learning exposure phase and could have impacted the representations. More consideration of this in the paper would be useful. Most critically, since the priming task involves seeing the related C items back-to-back, it would be important to consider whether this experience could have conceivably impacted the neural integration indices. I believe it never would have been the case that unrelated C items were presented sequentially during the priming task, i.e., that related C items always appeared together in this task. I think again the specificity of the remote condition is key and perhaps the authors can leverage this to support their interpretation. Can the authors consider this possibility in the Discussion?

      (4) For the priming task, based on the Figure 2A caption it seems as though every sequence contributes to both the control and primed conditions, but (I believe) this means that the control transition always happens first (and they are always back-to-back). Is this a concern? If RTs are changing over time (getting faster), it would be helpful to know whether the priming effects hold after controlling for trial numbers. I do not think this is a big issue because if it were, you would not expect to see the specificity of the remotely learned information. However, it would be helpful to know given the order of these conditions has to be fixed in their design.

      (5) The authors should be cautious about the general conclusion that memories with overlapping temporal regularities become neurally integrated, given that their findings in mPFC are more consistent with overall differentiation (though as noted above, I think we need more data on this to know for sure what is going on).

      (6) It would be worth stating a few more details and perhaps providing additional logic or justification in the main text about how the pre- and post-exposure phases were set up and why. How many times was each object presented pre and post, and how was the sequencing determined (were any constraints put in place, e.g., such that C1 and C2 did not appear close in time)? What was the cover task (I think this is important to the interpretation and so belongs in the main paper)? Were there considerations involving the fact that this is a different sequence of the same objects the participants would later be learning, e.g., interference, etc.?
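
      As a concrete illustration of the ambiguity raised in point (2) above, the following minimal sketch uses made-up similarity values (not data from the study) and a simple 0/1 model coding related versus unrelated C pairs; the function names are hypothetical. It shows that two different underlying changes, related pairs becoming more similar versus unrelated pairs becoming less similar, yield the same change in model correspondence, which is why the per-condition values would need to be reported directly.

```python
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def model_fit(related, unrelated):
    """Correlate pairwise similarities with a 0/1 model (1 = indirectly related pair)."""
    sims = related + unrelated
    model = [1.0] * len(related) + [0.0] * len(unrelated)
    return pearson(sims, model)


# Hypothetical pre-learning similarities: no difference between pair types.
pre_related = [0.10, 0.20, 0.30, 0.40]
pre_unrelated = [0.10, 0.20, 0.30, 0.40]

# Scenario A: related pairs become MORE similar, unrelated pairs unchanged.
post_a_related = [s + 0.10 for s in pre_related]
post_a_unrelated = list(pre_unrelated)

# Scenario B: related pairs unchanged, unrelated pairs become LESS similar.
post_b_related = list(pre_related)
post_b_unrelated = [s - 0.10 for s in pre_unrelated]

fit_pre = model_fit(pre_related, pre_unrelated)
fit_a = model_fit(post_a_related, post_a_unrelated)
fit_b = model_fit(post_b_related, post_b_unrelated)

# Mean change in similarity for the related pairs only, per scenario.
related_change_a = sum(post_a_related) / 4 - sum(pre_related) / 4
related_change_b = sum(post_b_related) / 4 - sum(pre_related) / 4

print(fit_pre, fit_a, fit_b, related_change_a, related_change_b)
```

      In this toy case the model fit rises by the same amount in both scenarios, even though the related pairs change only in scenario A. The correlation-based index alone therefore cannot distinguish the two accounts.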

    3. Reviewer #2 (Public Review):

      The manuscript by Tompary & Davachi presents results from two experiments, one behavior only and one fMRI plus behavior. They examine the important question of how two separate object memories (C1 and C2) that are never experienced together in time become linked by shared predictive cues in a sequence (A followed by B followed by one of the C items). The authors developed an implicit priming task that provides a novel behavioral metric for such integration. They find significant C1-C2 priming for sequences that were learned 24h prior to the test, but not for recently learned sequences, suggesting that associative links between the two originally separate memories emerge over an extended period of consolidation. The fMRI study relates this behavioral integration effect to two neural metrics: pattern similarity changes in the medial prefrontal cortex (mPFC) as a measure of neural integration, and changes in hippocampal-LOC connectivity as a measure of post-learning consolidation. While fMRI patterns in mPFC overall show differentiation rather than integration (i.e., C1-C2 representational distances become larger), the authors find a robust correlation such that increasing pattern similarity in mPFC relates to stronger integration in the priming test, and this relationship is again specific to remote memories. Moreover, connectivity between the posterior hippocampus and LOC during post-learning rest is positively related to the behavioral integration effect as well as the mPFC neural similarity index, again specifically for remote memories. Overall, this is a coherent set of findings with interesting theoretical implications for consolidation theories, which will be of broad interest to the memory, learning, and predictive coding communities.

      Strengths:

      (1) The implicit associative priming task designed for this study provides a promising new tool for assessing the formation of mnemonic links that influence behavior without explicit retrieval demands. The authors find an interesting dissociation between this implicit measure of memory integration and more commonly used explicit inference measures: a priming effect on the implicit task only evolved after a 24h consolidation period, while the ability to explicitly link the two critical object memories is present immediately after learning. While speculative at this point, these two measures thus appear to tap into neocortical and hippocampal learning processes, respectively, and this potential dissociation will be of interest to future studies investigating time-dependent integration processes in memory.

      (2) The experimental task is well designed for isolating pre- vs post-learning changes in neural similarity and connectivity, including important controls of baseline neural similarity and connectivity.

      (3) The main claim of a consolidation-dependent effect is supported by a coherent set of findings that relate behavioral integration to neural changes. The specificity of the effects on remote memories makes the results particularly interesting and compelling.

      (4) The authors are transparent about unexpected results, for example, the finding that overall similarity in mPFC is consistent with a differentiation rather than an integration model.

      Weaknesses:

      (1) The sequence learning and recognition priming tasks are cleverly designed to isolate the effects of interest while controlling for potential order effects. However, due to the complex nature of the task, it is difficult for the reader to infer all the transition probabilities between item types and how they may influence the behavioral priming results. For example, baseline items (BL) are interspersed between repeated sequences during learning, and thus presumably can only occur before an A item or after a C item. This seems to create non-random predictive relationships such that C is often followed by BL, and BL by A items. If this relationship is reversed during the recognition priming task, where the sequence is always BL-C1-C2, this violation of expectations might slow down reaction times and deflate the baseline measure. It would be helpful if the manuscript explicitly reported transition probabilities for each relevant item type in the priming task relative to the sequence learning task and discussed how a match vs mismatch may influence the observed priming effects.

      (2) The choice of what regions of interest to include in the different sets of analyses could be better motivated. For example, even though briefly discussed in the intro, it remains unclear why the posterior but not the anterior hippocampus is of interest for the connectivity analyses, and why the main target is LOC, not mPFC, given past results including from this group (Tompary & Davachi, 2017). Moreover, for readers not familiar with this literature, it would help if references were provided to suggest that a predictable > unpredictable contrast is well suited for functionally defining mPFC, as done in the present study.

      (3) Relatedly, multiple comparison corrections should be applied in the fMRI integration and connectivity analyses whenever the same contrast is performed on multiple regions in an exploratory manner.
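
      As an illustration of the correction requested in point (3), the following is a minimal sketch of Holm's step-down procedure, one common way to control the family-wise error rate when the same contrast is run in several regions. The function name and the idea of passing one p-value per region are assumptions made for illustration; nothing here reproduces the authors' analysis or data.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return one boolean per input p-value: True where the null
    hypothesis is rejected under Holm's step-down procedure."""
    m = len(p_values)
    # Visit the tests from the smallest to the largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank, starting at 0) is
        # compared against alpha / (m - k).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            # Once one test fails, every larger p-value fails as well.
            break
    return reject
```

      Calling the function with one p-value per region of interest returns the tests that survive correction, in the same order as the input.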

    4. Reviewer #3 (Public Review):

      The authors of this manuscript sought to illuminate a link between a behavioral measure of integration and neural markers of cortical integration associated with systems consolidation (post-encoding connectivity, change in representational neural overlap). To that aim, participants incidentally encoded sequences of objects in the fMRI scanner. Unbeknownst to participants, the first two objects of the presented ABC triplet sequences overlapped for a given pair of sequences. This allowed the authors to probe the integration of unique C objects that were never directly presented in the same sequence, but which shared the same preceding A and B objects. They encoded one set of objects on Day 1 (remote condition), another set of objects 24 hours later (recent condition) and tested implicit and explicit memory for the learned sequences on Day 2. They additionally collected baseline and post-encoding resting-state scans. As their measure of behavioral integration, the authors examined reaction time during an Old/New judgement task for C objects depending on if they were preceded by a C object from an overlapping sequence (primed condition) versus a baseline object. They found faster reaction times for the primed objects compared to the control condition for remote but not recently learned objects, suggesting that the C objects from overlapping sequences became integrated over time. They then examined pattern similarity in a priori ROIs as a measure of neural integration and found that participants showing evidence of integration of C objects from overlapping sequences in the medial prefrontal cortex for remotely learned objects also showed a stronger implicit priming effect between those C objects over time. 
      When they examined the change in connectivity between their ROIs after encoding, they also found that connectivity between the posterior hippocampus and lateral occipital cortex correlated with larger priming effects for remotely learned objects, and that lateral occipital connectivity with the medial prefrontal cortex was related to neural integration of remote objects from overlapping sequences.

      The authors' aim to provide evidence of a relationship between behavioral and neural measures of integration with consolidation is interesting, important, and difficult to achieve given the longitudinal nature of the studies required to answer this question. Strengths of this study include a creative behavioral task, and solid modelling approaches for fMRI data with careful control for several known confounds such as the influence of BOLD activation on pattern analysis results, motion, and physiological noise. The authors replicate their behavioral observations across two separate experiments, one of which included a large sample size, and found similar results that speak to the reliability of the observed behavioral phenomenon. In addition, they document several correlations between neural measures and task performance, lending functional significance to their neural findings.

      However, this study is not without notable weaknesses that limit the strength of the manuscript. The authors report a behavioral priming effect suggestive of integration of remote but not recent memories, leading to the interpretation that the priming effect emerges with consolidation. However, they did not observe a reliable interaction between the priming condition and learning session (recent/remote) on reaction times, meaning that the priming effect for remote memories was not reliably greater than that observed for recent. In addition, the emergence of a priming effect for remote memories does not appear to be due to faster reaction times for primed targets over time (the condition of interest), but rather, slower reaction times for control items in the remote condition compared to recent. These issues limit the strength of the claim that the priming effect observed is due to C items of interest being integrated in a consolidation-dependent manner.
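
      The distinction drawn above, between a priming effect that is significant in one session and an effect that is significantly larger in one session than in the other, can be sketched as follows. This is a minimal illustration only: the data layout, the function names, and the use of a paired t statistic are assumptions, not the authors' actual pipeline.

```python
import math
from statistics import mean, stdev

def paired_t(differences):
    """One-sample t statistic against zero, with its degrees of freedom,
    for a list of per-participant difference scores."""
    n = len(differences)
    t = mean(differences) / (stdev(differences) / math.sqrt(n))
    return t, n - 1

def priming_interaction(rt):
    """rt maps (session, condition) to per-participant mean reaction
    times in ms, with participants in the same order in every list."""
    # Priming effect = control RT minus primed RT (positive = faster when primed).
    remote = [c - p for c, p in zip(rt[("remote", "control")], rt[("remote", "primed")])]
    recent = [c - p for c, p in zip(rt[("recent", "control")], rt[("recent", "primed")])]
    # Interaction contrast: is the remote effect larger than the recent effect?
    interaction = [a - b for a, b in zip(remote, recent)]
    return {
        "remote": paired_t(remote),
        "recent": paired_t(recent),
        "interaction": paired_t(interaction),
    }
```

      Only the "interaction" entry addresses whether priming differs between sessions; the "remote" and "recent" entries each test one session on its own.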

      Similarly, the interactions between neural variables of interest and learning session needed to strongly show a significant consolidation-related effect in the brain were sometimes tenuous. There was no reliable difference in neural representational pattern analysis fit to a model of neural integration between the short and long delays in the medial prefrontal cortex or lateral occipital cortex, nor was the posterior hippocampus-lateral occipital cortex post-encoding connectivity correlation with subsequent priming significantly different for recent and remote memories. While the relationship between integration model fit in the medial prefrontal cortex and subsequent priming (which was significantly different from that occurring for recent memories) was one of the stronger findings of the paper in favor of a consolidation-related effect on behavior, is it possible that lack of a behavioral priming effect for recent memories due to possible issues with the control condition could mask a correlation between neural and behavioral integration in the recent memory condition?

      These limitations are especially notable when one considers that priming does not classically require a period of prolonged consolidation to occur, and that prominent models of systems consolidation pertain rather to explicit memory. While the authors have provided evidence that neural integration in the medial prefrontal cortex, as well as post-encoding coupling between the lateral occipital cortex and posterior hippocampus, are related to faster reaction times for primed objects of overlapping sequences compared to their control condition, more work is needed to verify that the observed findings indeed reflect consolidation-dependent integration as proposed.

    1. eLife assessment

      This important study uses convincing time-resolved proximity proteomics, validated with proximity ligation assays, to provide new insight into mechanical regulation of caveolin-1 complexes that form in migrating cells. Solid follow-up experiments reveal a reciprocal relationship between mechanosensitive caveolae and RhoGTPase signalling in migrating cells, but evidence supporting a direct link between the newly identified factors and a specific caveolae subpopulation remains incomplete at this stage.

    2. Reviewer #1 (Public Review):

      In this study, Girardello et al. use proteomics to reveal the membrane-tension-sensitive caveolin-1 interactome in migrating cells. The authors use EM and surface rendering to demonstrate that caveolae formed at the rear of migrating cells are complex membrane-linked multilobed structures, and they devise a robust strategy to identify caveolin-1-associated proteins using APEX2-mediated proximity biotinylation. This important dataset is further validated using proximity ligation assays to confirm key interactions, and the authors follow up with an interrogation of a surprising relationship between caveolae and RhoGTPase signalling, where caveolin-1 recruits ROCK1 under high membrane tension conditions, and ROCK1 activity is required to reform caveolae upon reversion to isotonic solution. However, caveolin-1 recruits the RhoA inactivator ARHGAP29 when membrane tension is low, and ARHGAP29 overexpression leads to disassembly of caveolae and reduced cell motility. This study builds on previous findings linking caveolae to positive feedback regulation of RhoA signalling, and provides further evidence that caveolae serve to drive rear retraction in migration but also possess an intrinsic brake to limit RhoA activation, leading the authors to suggest that cycles of caveolae assembly and disassembly could thereby be central to establishing a stable cell rear for persistent cell migration.

      A major strength of the manuscript is the robust proteomic dataset. The experimental setup is well defined and mostly well controlled, and there is good internal validation in that the high abundance of core caveolar proteins in low membrane tension (isotonic) conditions, and their absence under high membrane tension (brief hypo-osmotic shock) conditions, correlate very well with previous findings. The data could however be better presented to show where statistically robust changes occur, and the supplementary information should include a table showing abundance. It's very good to see a link to PRIDE, providing a useful resource for the community.

      The authors detail several known interactions and their mechanosensitivity, but also report new interactors of caveolin-1. Judging from the PLA data, several mechanosensitive interactions of caveolin-1 take place at the cell rear, but others are more diffuse across the cell (e.g. FLN1, CTTN, HSPB1; Figure 4A-F and Figure 4 supplement 1). It is interesting to speculate that those at the cell rear are involved in caveolae, whilst others are linked specifically to caveolin-1 (e.g. dolines). PLA or localisation analysis with Cavin1/PTRF may be able to resolve this and further distinguish caveolae from non-caveolae mechanosensitive interactions.

      The Cav1/ARHGAP29 influence on YAP signalling is interesting, but appears to be quite isolated from the rest of the manuscript. Does overexpression of ARHGAP29 influence YAP signalling and/or caveolar protein expression/Cav1pY14?

      The ARHGAP29 and RhoA/ROCK1 related observations are very interesting and potentially really important. However, the link between ARHGAP29 and caveolae is not well established (other than in the proteomic data). PLA or FRET could help establish this.

      The relationship between ARHGAP29 and RhoA signalling is not well defined. Is GAP activity important in determining the effect on migration and caveolae formation? What is the effect on RhoA activity? Alternatively, the authors could investigate YAP-dependent transcriptional regulation downstream of overexpression.

    3. Reviewer #2 (Public Review):

      Girardello et al investigated the composition of the molecular machinery of caveolae governing their mechano-regulation in migrating cells. Using live cell imaging and RPE1 cells, the authors provide a spatio-temporal analysis of cavin-3 distribution during cell migration and reveal that caveolae are preferentially localized at the rear of the cell in a stable manner. They further characterize these structures using electron tomography and reveal an organization into clusters connected to the cell surface. Using a proteomic approach, they address the interactome of caveolin-1 upon mechanical stimulation by exposing RPE1 cells to hypo-osmotic shock (which aims to increase cell membrane tension) or not, as a control condition. The authors identify over 300 proteins, notably proteins related to the actin cytoskeleton and cell adhesion. These results were further validated in cellulo by interrogating protein-protein interactions using proximity ligation assays and hypo-osmotic shock. These experiments confirmed previous data showing that high membrane tension induces caveolae disassembly in a reversible manner. Finally, based on the literature and on the results of the proteomic analysis, the authors investigated in more depth the molecular signaling pathway controlling caveolae assembly upon mechanical stimuli. First, they confirm the targeting of ROCK1 with caveolin-1 and the implication of its kinase activity in caveolae formation (at the rear of the cell). Then, they show that the RhoGAP ARHGAP29, a factor newly identified by the proteomic analysis, is also implicated in caveolae mechano-regulation, likely through the YAP protein, and find that overexpression of ARHGAP29 affects cell motility. Overall, this paper interrogated the role of membrane tension in caveolae located at the rear of the cell and identified a new pathway controlling cell motility.

      Strengths:

      Using a proximity-based proteomic assay, the authors reveal the protein network interacting with caveolae upon mechanical stimuli. This approach is elegant and allows the identification of a substantial new set of factors involved in the mechano-regulation of caveolin-1, some of which have been verified directly in the cell by PLA. This study provides a compelling set of data on the interactions between caveolae and their cortical network, which was so far ill-characterized.

      Weaknesses:

      The methodology demonstrating an impact of membrane tension is not precise enough to assess a direct role on caveolae at a subcellular scale, that is, between the front and the rear of the cell. First, a better characterization of the "front-rear" cellular model is encouraged. Secondly, the authors frequently present osmotic shock as a "high membrane tension" stimulus. While osmotic shock is widely used in the field, this study is focused only on caveolae localized at the rear of the cell, and it remains unclear how a global mechanical stimulus triggered by an osmotic shock could mimic a local stimulus. In the present case, the extent to which this mechanical stress is physiologically relevant to mimic mechanical forces applied at the rear of a migrating cell remains unknown.

      Some images are not satisfying enough to fully support the conclusions of the article. An unbiased quantitative analysis of the spatio-temporal distribution of caveolae upon well-defined mechanical stimuli is also needed at this stage. Cells in the images, in particular in Figure 1, are difficult to see. Differences in signal-to-noise ratio between cell areas could generate a bias. Since caveolae density and localization are inconsistent between figures, more solid illustrations are needed along with quantitative analysis.

    1. eLife assessment

      The authors present a useful analysis of the phenotype of sheep in which the muscle developmental regulator myostatin has been mutated in a FGF5 knockout background. The goal was to produce sheep with a "double-muscled" phenotype, yet the genetically engineered sheep exhibited meat with a smaller cross-sectional area and higher number of muscle fibers. The work extends the extensive body of knowledge already published in this area. The authors provide evidence using in vitro experiments that Fosl1 regulates myogenesis, but the strength of evidence relating to the muscle phenotype and underlying cellular and molecular mechanism remains incomplete.

    2. Reviewer #3 (Public Review):

      Although the authors' findings are interesting, they do little to demonstrate new scientific information or advancements in producing genetically modified livestock with improved production characteristics. While the MSTNDel273 sheep exhibited an increased number of muscle fibers, the data provided did not demonstrate a significant improvement in meat production, quality or quantity in the MSTNDel273 sheep vs WT.

      The manuscript is very long, complicated and difficult to read, given the minimal amount of significant information that is provided. It reads more like a graduate student thesis than a scientific manuscript ready for publication. Given that the significant findings are so minimal, the amount of text, figures and tables provided is excessive. A large number of different molecular techniques are employed to try to decipher the mechanism(s) that result in the observed phenotype (double muscling). The authors focus on the MEK-ERK-FOSL1 pathway and suggest this is the key pathway/mechanism resulting in the phenotype observed in MSTNDel273 sheep. However, they provide very little "significant" evidence to support this. RNA-Seq data demonstrated that hundreds of different genes were either upregulated or downregulated, but the authors chose to focus only on FOSL1 and associated genes. The findings do not support the idea that FOSL1 is not involved, but neither do they strongly support FOSL1 involvement. The observations made by the authors could be coincidental and not causative in nature.

      The authors indicate that sgRNA design changes, in addition to changing the molar ratio of Cas9 mRNA:sgRNA, improved the ability to generate biallelic homozygous mutant sheep; however, the data provided do not demonstrate any significant difference. Given the small number of sheep that were actually produced and evaluated, it is extremely difficult to demonstrate that anything analyzed was significantly (statistically) different between MSTNDel273 sheep and WT, yet the authors seem to ignore this in much of their discussion. There is no explanation as to why the authors started with sheep that were FGF5 knockouts. The reviewer assumes that this was simply a line of sheep available from previous studies and that the goal was to produce sheep with both improved hair/wool characteristics and improved muscle development. However, the use of FGF5 knockout sheep complicates the ability to accurately decipher the unique aspects associated with targeting only myostatin for knockout. At minimum, this is a variable that has to be considered in the statistical analysis. No information is provided on the methods used to produce the MSTNDel273 sheep, which seems fundamentally important. It is assumed they were produced by injecting one-cell zygotes and then transferring these into surrogate females, but given the information provided, it is impossible to know. Certainly, the methods employed could have a profound effect on the outcome. There is no information provided on the sex of the animals produced and then analyzed.

      Comments on revised version:

      The manuscript by Chen et al. is improved and demonstrates successful gene editing in sheep embryos to obtain biallelic mutation of Mstn and FGF5. Despite the improvements in the revised manuscript, the evidence on the cellular and molecular mechanism remains inadequate to conclude whether Fosl1 indeed acts downstream of myostatin. In addition, there is little that is new rather than confirmatory of what is already well known regarding Mstn and FGF5.

      There are also a number of editorial mistakes; e.g., the authors refer to Tables S1-S4 in the Materials and Methods and Results sections, but no Tables S1-S4 are provided.

    3. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The authors present a useful analysis of the phenotype of sheep in which the muscle developmental regulator myostatin has been mutated in a FGF5 knockout background. The goal was to produce sheep with a "double-muscled" phenotype, yet the genetically engineered sheep exhibited meat with a smaller cross-sectional area and higher number of muscle fibers. The work extends the extensive body of knowledge already published in this area. The authors provide evidence using in vitro experiments that Fosl1 regulates myogenesis, but the strength of evidence relating to the muscle phenotype and underlying cellular and molecular mechanism is inadequate.

      Thank you for the assessment. According to the reviewers' comments, we have supplemented and updated the data on muscle phenotypes, and the molecular mechanisms have also been supplemented accordingly, for example with FOSL1 silencing and inhibition, as well as possible secondary fusion of myoblasts regulated by calcium signaling. Meanwhile, considering the suggestions of the editors and reviewers, we have also supplemented the data on serum MSTN regulation. Given that the phenotype of MSTN gene editing is mutation-site dependent, we directly cultured skeletal muscle satellite cells using serum from WT and MF+/- sheep, and showed that serum regulation cannot be ignored after MSTNDel273C mutation with FGF5 knockout.

      Public Review:

      Chen and collaborators first analysed embryonic gene editing in sheep, using CRISPR-Cas9 technology to invalidate the two alleles of the Mstn and Fgf5 genes with different ratios of Cas9 mRNA and sgRNA. They showed that a ratio of 1:10 had the highest efficiency and they successfully generated two sheep with biallelic mutations of both genes. A Materials and Methods description of the generation of the gene-edited sheep is entirely missing. The data on these gene-edited sheep have already been published twice by the authors in different contexts. Other groups have reported on gene editing of Mstn or Fgf5 in sheep embryos and the resulting phenotypes.

      We thank the reviewers for pointing out our negligence and shortcomings. We have provided detailed information on the method used to generate the gene-edited sheep in the Materials and Methods. Briefly, gene-edited sheep were produced by injecting MSTN sgRNA, FGF5 sgRNA, and Cas9 mRNA into embryos in different ratios.

      Although the findings are interesting, they do not provide sufficiently new scientific information or advancements in producing genetically modified livestock with improved production characteristics. While the MSTNDel273 sheep exhibited an increased number of muscle fibers, the data provided did not demonstrate a significant improvement in meat production, quality or quantity in the MSTNDel273 sheep vs WT.

      Thank you very much for your constructive comments. Considering the lack of data on improved production traits, we have further supplemented the data on meat yield and quality of MSTNDel273C mutation with FGF5 knockout sheep in Tables S6-10. Although these improvements were not significant enough, our data showed increased meat production traits in MSTNDel273C mutation with FGF5 knockout sheep, such as the proportion of hind leg meat to carcass and the proportion of gluteus medius to carcass. For example, the proportion of hind leg meat was significantly increased by 21.2% (Table S7), and the proportion of gluteus medius in the carcass of MF+/- sheep was significantly (P<0.01) increased by 26.3% compared to WT sheep (Figure 2K). In addition, there were no significant (P>0.05) differences in pH, color, drip loss, cooking loss, shearing force, and amino acid content of the longissimus dorsi between WT and MF+/- sheep (Tables S8-10). All these results demonstrated that the MSTNDel273C mutation with FGF5 knockout sheep had well-developed hip muscles with smaller muscle fibers, which does not affect meat quality, and that this phenotype may be dominated by the MSTN gene.

      The authors indicate that sgRNA design changes, in addition to changing the molar ratio of Cas9 mRNA:sgRNA, improved the ability to generate biallelic homozygous mutant sheep; however, the data provided do not demonstrate any significant difference. Given the small number of sheep that were actually produced and evaluated, it is extremely difficult to demonstrate that anything analyzed was significantly (statistically) different between MSTNDel273 sheep and WT, yet the authors seem to ignore this in much of their discussion. There is no explanation as to why the authors started with sheep that were FGF5 knockouts. The reviewer assumes that this was simply a line of sheep available from previous studies and that the goal was to produce sheep with both improved hair/wool characteristics and improved muscle development. However, the use of FGF5 knockout sheep complicates the ability to accurately decipher the unique aspects associated with targeting only myostatin for knockout. At minimum, this is a variable that has to be considered in the statistical analysis. No information is provided on the methods used to produce the MSTNDel273 sheep, which is fundamentally important. It is assumed they were produced by injecting one-cell zygotes and then transferring these into surrogate females. The methods employed might have a profound effect on the outcome.

      We greatly appreciate your review. In the current study, we did not discuss the impact of changes in sgRNA design on the ability to generate biallelic homozygous mutant sheep. In fact, we focused on the delivery molar ratio of Cas9 mRNA to sgRNA and found that increasing the molar ratio of Cas9:sgRNA can improve the ability to produce homozygous biallelic mutations in sheep. We apologize for neglecting this statistical analysis; in the revised version, the significance of the differences was tested with the chi-square test. Other restrictions related to the number of sheep actually produced and evaluated are analyzed in our additional discussion. We should clarify that the gene-edited sheep we produced did not start from FGF5 knockout sheep. As hypothesized by the reviewers, we used a one-step method to edit the MSTN and FGF5 genes simultaneously, in order to increase muscle yield and improve wool characteristics in sheep at the same time, which resulted in knockout of the FGF5 gene and mutation of the MSTN gene. As speculated by the reviewers, the MSTNDel273C mutation with FGF5 knockout sheep was generated by injecting sgRNA and Cas9 mRNA of MSTN and FGF5 into a single fertilized egg, which was then transplanted into a surrogate mother. We have provided detailed information on the method used to generate the gene-edited sheep in the Materials and Methods section.
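
      For readers who want to see what a chi-square comparison of editing outcomes between two Cas9 mRNA:sgRNA ratios involves, the following is a minimal sketch for a 2x2 table (ratio A vs ratio B, biallelic mutant vs other). The function names and table layout are assumptions made for illustration, and no counts from the manuscript are used. With the small group sizes typical of large-animal work, expected cell counts can fall below 5, in which case the chi-square approximation is unreliable and an exact test is usually preferred.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 degree of freedom, no continuity
    correction) for the contingency table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)

def smallest_expected_count(a, b, c, d):
    """Smallest expected cell count under independence; values below 5
    suggest the chi-square approximation should not be trusted."""
    n = a + b + c + d
    return min(r * k / n for r in (a + b, c + d) for k in (a + c, b + d))

# Critical value of the chi-square distribution for alpha = 0.05 with 1 df.
CRITICAL_05_1DF = 3.841
```

      A statistic above `CRITICAL_05_1DF` would indicate a difference at the 0.05 level, provided `smallest_expected_count` is large enough for the approximation to hold.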

      The authors genotyped one sheep with a biallelic three-base-pair deletion in Mstn exon 3 and a compound heterozygous mutation in Fgf5, with a 5-nucleotide deletion on one allele and a 37-nucleotide deletion on the other allele, partially spanning the same region. This sheep developed a double muscle phenotype, which was documented using photography and CT scan. The hair phenotype was not further addressed, but the authors referred to a previous publication.

      Thank you for your review. In the current study, we focused only on the muscle phenotype, while the data on the hair phenotype belong to another study. Therefore, we referred to our previous publication on hair phenotypes, in which the mutation locus in FGF5 gene-edited sheep is the same as in the current study.

      The authors performed morphometric studies on two distinct muscles, longissimus dorsi and gluteus medius, and found a profound fiber hypotrophy in the Mstn-/-;Fgf5-/- double mutants, with a shift from larger to smaller fiber sizes. Morphometric studies showed that only a low percentage of fibers in wt and mutant sheep had fiber cross-sectional areas (CSA) larger than 800 µm2, whereas about 30% in wt and about 60% in the mutant had a CSA of <400 µm2. The report of one case, without reproducing the phenotype in other sheep, is scientifically insufficient. The fiber sizes in wt sheep remain far below previously published reports in sheep (about 3-5 times smaller) and below those of other species, which suggests a methodological error in the morphometric methods.

      We greatly appreciate your careful review. There was indeed an error in the morphological analysis of the MF-/- sheep longissimus dorsi and gluteus medius muscles. After careful checking, we found that the fiber sizes in WT sheep were far below previously published reports in sheep because the scale had been used incorrectly. Thus, we re-scanned the tissue sections and recalculated the cross-sectional area of muscle fibers and the number of muscle fiber cells per unit area with the correct scale. With this correction, the average cross-sectional area of muscle fibers in WT sheep was approximately 1800 μm2, which is consistent with the previous report. We once again thank the reviewer for such a careful and conscientious review. Considering the profound fiber hypotrophy in MSTNDel273C mutation with FGF5 knockout sheep pointed out by the reviewer, we performed a statistical analysis of the proportion of centrally nucleated myofibres in WT and MF+/- sheep, which can characterize the occurrence of muscle fiber hypotrophy. The results showed that there was no significant difference in the proportion of centrally nucleated myofibres between WT and MF+/- sheep (Figure S2D). At the same time, we also analyzed the mRNA expression levels of genes related to muscle fiber hypotrophy and muscle atrophy, such as MTM1, DMD, IGF1, SMN1, and GAA. Although the levels of MTM1, IGF1, SMN1, and GAA were significantly increased (Figure S2E), this elevation did not result in muscle fiber hypotrophy or muscle atrophy, but was beneficial for muscle formation. Therefore, we suggest that the phenomenon produced by MSTNDel273C mutation with FGF5 knockout may not be muscle fiber hypotrophy, because MSTNDel273C mutation with FGF5 knockout significantly promotes the proliferation of sheep skeletal muscle satellite cells (Figure 3A-F) and, more importantly, the muscle phenotype of MF-/- and MF+/- sheep was improved, including the "double-muscle" phenotype of the rump (Figure 2A), the proportion of gluteus medius in the carcass (Figure 2K), and the proportion of hind leg meat (Table S7).

      The authors also investigated the influence of Fgf5 mutation on muscle development. They determined fiber cross sectional area in heterozygous Fgf5 mutant (number of investigated animals not given) and conclude that Mstn mutation but not Fgf5 mutation caused the double muscle phenotype. Results are insufficient to support this conclusion. Firstly, authors investigated heterozygous FGF5 sheep and not homozygous mutants. Secondly, FGF5 has previously been shown to stimulate expansion of connective tissue fibroblasts and to inhibit skeletal muscle development during limb embryonic development (Clase et al. 2000). Of note, Mstn is also expressed during embryonic development. A combined knockout could therefore entail synergistic effects and cause muscle hyperplasia that is not found in individual knockout, a hypothesis that was not addressed by the authors.

      Thank you very much for your critical review, which is very valuable for improving the quality of our manuscript. We have given the number of animals studied in all figure legends. Given the lack of MSTN and FGF5 single gene edited sheep, both homozygous and heterozygous, and especially of MSTN single gene edited sheep, we have weakened the claim in the conclusion and discussion that MSTN mutations rather than FGF5 mutations lead to the “double-muscle” phenotype. As you have mentioned, our current data are indeed insufficient to support this conclusion. In addition, considering the expression of MSTN and FGF5 in embryonic development and their regulation of skeletal muscle development, we examined the expression of MSTN and FGF5 during individual development after MSTN_Del273C mutation with FGF5 knockout (Figure S2A). However, these results are limited by the availability of animals for embryonic development studies, especially single gene edited embryos. We greatly appreciate your very meaningful and valuable comments on the possible synergistic effects of combined knockdown. We will prepare MSTN and FGF5 single gene edited sheep to further explore possible synergistic effects in a follow-up study.

      The authors generated and studied an F1 generation of mutant sheep with heterozygous mutations in Mstn and Fgf5. In Mstn+/-;Fgf5+/-, the gluteus medius muscle was found to be larger compared to wt sheep, whereas other muscles were smaller, and overall meat quantity did not change. Morphometric studies revealed muscle fiber hypotrophy and muscle hyperplasia similar to those in the Mstn-/-;Fgf5-/- gluteus muscle.

      Thank you for your comments. We found that the proportion of gluteus medius in MF+/- sheep was larger than that in WT sheep, and in addition, the proportion of hind leg meat also significantly increased (Table S7). Morphological analysis shows that MF+/- sheep exhibited a myofiber hyperplasia phenotype similar to MF-/- sheep.

      In the next part of the results, the authors investigated the presence of myostatin protein in homozygous Mstn muscle using immunohistochemistry and found no differences compared to wt; however, positive and negative controls are missing. They also determined Mstn transcription and protein quantity using WB in heterozygous Mstn muscle and found no difference. The authors did not provide data to explain why the Mstn mutation generated here causes muscle fiber hypotrophy, whereas most work on myostatin abrogation demonstrated fiber hypertrophy.

      Thank you very much for your constructive comments. Due to the lack of the necessary positive and negative controls in the immunohistochemistry study, we decided to delete the immunohistochemistry data from the manuscript, which also streamlines it further. In the current study, although mutations in MSTN led to a decrease in the cross-sectional area of individual fibers, the number of muscle fibers per unit area was increased, and the final result was an increase in muscle volume and a “double-muscle” phenotype, as well as an increase in the proportion of gluteus medius in the carcass (Figure 2K) and the proportion of hind leg meat (Table S7). Importantly, there was no significant difference in the proportion of centrally nucleated myofibres between WT and MF+/- sheep (Figure S2D), and the elevated expression levels of the muscle fiber hypotrophy and muscle atrophy marker genes MTM1, IGF1, SMN1, and GAA are more beneficial for muscle health. Therefore, we maintain that this is not muscle fiber hypotrophy. As for the muscle fiber hypertrophy phenotype demonstrated by most myostatin abrogation studies, we analyzed the possible reasons in the discussion, namely that the effect of MSTN mutation on muscle fiber phenotype may be mutation site-dependent.

      Authors then isolated myoblasts from hind limbs of 3-month-old sheep fetuses and cultured in presence of 20% fetal bovine serum before switching to differentiation medium containing 2% horse serum. The cultures showed increased proliferation of Mstn+/-;Fgf5+/- myoblasts as well as downregulation of genes associated with muscle differentiation as well as reduced fusion index. No experiments were performed to assure whether the myostatin and FGF5 pathways were inhibited. No control experiments using supplementation with recombinant proteins and using growth factor depleted culture supplements were performed. As FGF5 and myostatin are secreted factors, evidence is missing whether this led to conditioning of the culture medium. Of note, previous work in mice demonstrated that the double muscle phenotype developed independent of satellite cells activity (Amthor et al. 2009).

      We greatly appreciate your valuable suggestions. In addition to detecting the MSTN pathway at the cellular level, we also assayed the expression of MSTN receptors and downstream Smad and Jun families in the gluteus medius, and found that MSTN_Del273C mutation with FGF5 knockout led to upregulation of two receptors, while the expression of downstream Smad and Jun families was also inhibited to varying degrees (Figure S4A). Considering the possible serum regulation, we also supplemented the data on serum MSTN regulation. Given that the phenotype of MSTN gene editing is mutation site dependent, we directly cultured skeletal muscle satellite cells using serum from WT and MF+/- sheep. We found that serum from MF+/- sheep promoted the proliferation of skeletal muscle satellite cells (Figure S4D). MSTN_Del273C mutation with FGF5 knockout promoted FOSL1 expression using WT sheep serum (Figure S4E), which was similar to the results of FBS culture and HS induction. The serum from MF+/- sheep strongly stimulated FOSL1 expression and the inhibition of MyoD1 (Figure S4F). These results indicate that serum regulation cannot be ignored after MSTN_Del273C mutation with FGF5 knockout.

      Authors then performed RNA seq from Mstn+/-;Fgf5+/- muscle and found a number of differentially expressed genes, but none has been previously reported as being involved in the myostatin signaling pathway, so the authors chose to focus only on FOSL1 and associated genes. Authors then demonstrated that Pdpn and Ankrd2 were upregulated during myogenic differentiation, whereas FOSL1 was downregulated. Moreover, Fosl1 transcription was upregulated in myoblasts and myotubes from Mstn+/-;Fgf5+/- muscle. Authors showed an interaction between Fosl1 and Myod1. Moreover, authors demonstrated that Fosl1 directly binds to the Myod1 promoter. Authors also found decreased p38 MAPK protein levels in proliferating myoblasts from Mstn+/-;Fgf5+/- muscle and increased p38 MAPK in differentiating myotubes.

      In the revised version, we have streamlined this section by removing content such as PDPN, ANKRD2, and p38 MAPK, aiming to focus on the MEK-ERK-FOSL1 axis. Meanwhile, we further confirmed the regulatory effect of FOSL1 on MyoD1 by dual luciferase assay.

      Furthermore, gain-of-function by overexpressing FOSL1 promoted cell proliferation and inhibited differentiation, and tert-butylhydroquinone, an indirect activator of FOSL1 also inhibited myogenic differentiation. The findings do not support the idea that FOSL1 is not involved, but neither do they strongly support the involvement of FOSL1. The observations made by the authors could be co-incidental and not causative in nature.

      We greatly appreciate the valuable suggestions provided by the reviewers, which are of great significance for improving our manuscript. Following the reviewers’ suggestions, we added FOSL1 loss-of-function experiments and found that interfering with FOSL1 inhibits the proliferation and promotes the differentiation of skeletal muscle satellite cells, which is the opposite of the results of FOSL1 overexpression (Figure 6). Meanwhile, we also used the inhibitor PD98059 to inhibit the ERK pathway and thereby indirectly inhibit the activity of FOSL1, and the results showed that inhibition of FOSL1 activity also promoted myogenic differentiation (Figure 7F-G). These results further support the important role of FOSL1.

      The manuscript by Chen et al. demonstrated successful gene editing in sheep embryos to obtain biallelic mutation of Mstn and FGF5. The resulting double muscle phenotype resulted from fiber hypotrophy and hyperplasia, which contradicts findings in the literature. Chen et al. generated F1 heterozygous offsprings, in which Mstn transcription and translation did not change. Myoblasts from these animals showed increased proliferation and decreased differentiation, which authors interpreted as the underlying cellular mechanism of the double muscle phenotype. However, no work on muscle development in these animals is presented. Important in vitro control experiments are missing. Chen and collaborators found Fosl1 as a differentially expressed gene in Mstn+/-;Fgf5+/- muscle. Fosl1 drives myoblast proliferation and has direct regulatory effect on the Myod1 promoter. The cellular and molecular mechanism of Fosl1 during myogenesis is novel and solid evidence. However, data remain inadequate to conclude whether Fosl1 indeed acts downstream of myostatin.

      We greatly appreciate the reviewers' insightful and very constructive suggestions, which were very helpful for further improving our data. In our study, although the mutation in MSTN resulted in a decrease in the cross-sectional area of individual muscle fibers, the number of muscle fibers per unit area increased, which ultimately resulted in an increase in muscle size and the development of a "double-muscle" phenotype. Therefore, we maintain that this is not a manifestation of muscle fiber dystrophy, and the analysis of marker genes for muscle fiber dystrophy and of the proportion of centrally nucleated muscle fibers also supports this view (Figure S2E-F). In addition, some of our findings, such as the reduced cross-sectional area per muscle fiber, contradict the literature on muscle fiber hypertrophy, which may be due to phenotypic differences caused by mutations at different sites of MSTN and perhaps may also be species-related. For example, Belgian Blue cattle with a natural mutation in the MSTN gene have an increased number of myofibers and a reduced myofiber cross-sectional area [1], whereas knockdown of the MSTN gene in mice leads to an increase in the cross-sectional area of muscle fibers without affecting the number of muscle fibers [2,3], as we further describe in the discussion. It should be noted that possible complementary regulation by FGF5 cannot be ruled out either, but unfortunately, this makes the problem extraordinarily complex. We plan to produce single mutant sheep by segregating the MSTN and FGF5 genes in subsequent studies and to give full consideration to the current problem.
Regarding the muscle development of gene-edited animals, due to the constraints of working with large animals and the limited number of edited individuals, we have not comprehensively evaluated the process of muscle development in vivo to further clarify the potential cellular mechanisms of the muscle phenotype, apart from evaluating the expression of MSTN and FGF5 at 3 months of age and the expression of MSTN at 12 months of age (Figure S2A). To determine whether FOSL1 indeed acts downstream of MSTN, we added data on FOSL1 expression levels under serum regulation to support our conclusions (Figure S4D-F).

      [1] Wegner J, Albrecht E, Fiedler I, Teuscher F, Papstein HJ, Ender K. Growth- and breed-related changes of muscle fiber characteristics in cattle[J]. Journal of Animal Science, 2000,78:1485-1496.

      [2] Nishi M, Yasue A, Nishimatu S, Nohno T, Yamaoka T, Itakura M, Moriyama K, Ohuchi H, Noji S. A missense mutant myostatin causes hyperplasia without hypertrophy in the mouse muscle[J]. Biochemical and Biophysical Research Communications, 2002,293:247-251.

      [3] Zhu X, Hadhazy M, Wehling M, Tidball JG, McNally EM. Dominant negative myostatin produces hypertrophy without hyperplasia in muscle[J]. FEBS Letters, 2000,474:71-75.

      As the significant findings are minimal, the amount of text, figures and tables provided is disproportionately excessive. A large number of different molecular techniques are employed to try to decipher the mechanism(s) that result in the observed phenotype, i.e. double muscling. The authors focus on the MEK-ERK-FOSL1 pathway and suggest that this is the key pathway/mechanism resulting in the phenotype observed in MSTNDel273 sheep. However, they provide very little solid evidence to support this notion.

      Thank you for your review. We have substantially streamlined the manuscript, removed some irrelevant information, and moved all non-essential figures and tables to the supplementary information. Meanwhile, we have added new data to further support that the MSTN_Del273C mutation generates a muscle phenotype through the MEK-ERK-FOSL1 pathway.

      The manuscript is very long, complicated and difficult to read, given the minimal amount of significant information that is provided. It requires major rewriting to be published. Further, information is missing from the materials and methods on the generation of animals, on histological techniques and on morphometric studies. There is no information provided on the sex of the animals produced and then analyzed. There are also a number of editorial mistakes, e.g. the authors refer to tables S1-S4 in the materials and methods and results sections, but no tables S1-S4 are provided.

      Thank you for your review. We have greatly streamlined and significantly revised the manuscript. At the same time, we have added detailed information on animal generation and on histological and morphological studies to the materials and methods, as well as information on the gene-edited animals produced, including sex, age, and so on. Finally, we reviewed the entire manuscript and corrected any omissions or oversights, such as the missing tables S1-S4.

      Recommendations for the authors:

      Suggestions to improve the paper (see also public review):

      - Include the method part of generating the gene edited animals.

      We thank the editor and reviewers for pointing out our negligence. We have provided detailed information on the generation method of gene-edited sheep in Materials and Methods, which was produced by injecting MSTN sgRNA, FGF5 sgRNA, and Cas9 mRNA into embryos in different ratios.

      - Increase number of Mstn-/-;Fgf5-/- experimental animals allowing for acquisition of statistically relevant data. This is very important as the muscle phenotype of the F1 generation is not obvious. Authors should provide data that the Mstn mutation indeed invalidates myostatin signaling. They should provide data on myostatin protein Mstn transcription as well on myostatin target genes in Mstn-/-;Fgf5-/- sheep.

      Many thanks to the editor and reviewers for their constructive suggestions. Using MF-/- sheep to validate the transcription and target gene data of myostatin would indeed be the best strategy. However, we generated only one MF-/- sheep, which seriously limits the implementation of this optimal strategy and would also make statistical analysis based on MF-/- sheep unreliable. Considering these factors, our current study mainly focuses on heterozygous MF+/- sheep. We are planning to generate single gene homozygous mutant sheep by segregating the MSTN and FGF5 genes in subsequent studies and to give full consideration to the current issue.

      - They should also provide data on myostatin target genes in muscles from heterozygous animals.

      Thank you for your very informative suggestions. We have quantified the mRNA expression levels of the receptors and downstream target genes of MSTN in the gluteus medius of heterozygous MF+/- sheep. Compared with WT sheep, the mRNA expression levels of the type I receptor (ACVR1) and type II receptors (ACVR2A, ACVR2B) were highly significantly increased in the muscle of MF+/- sheep (Figure S4A). There was no significant change in mRNA expression levels in the Smad family (Figure S4B), whereas the mRNA expression level of JunB of the Jun family, a downstream target gene of MSTN, was significantly downregulated (Figure S4C). These results suggest that the effect of MSTN_Del273C with FGF5 knockout may not be limited to MEK-ERK-FOSL1. Again, we would like to thank the editor and reviewers for their constructive suggestions, which provide a new direction for us to further deepen our insight into mutations of the MSTN gene.

      - The morphometric results on fiber CSA seem wrong. Judging from the fiber sizes and the size bar in Figure 2H, the estimated CSA should be far higher. There must be a systematic error in the use of the morphometric algorithm.

      Thank you very much for your careful review. There were indeed some errors in the morphological analysis of the MF-/- sheep longissimus dorsi and gluteus medius. After checking, we found that the muscle fiber size was much lower than the previously published data for sheep because an incorrect scale bar had been used. We therefore re-scanned the tissue slices and used the correct scale bar to re-calculate the cross-sectional area of muscle fibers and the number of muscle fiber cells per unit area. With this correction, the average cross-sectional area of muscle fibers in WT sheep was similar to the previous report.
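
      As an illustration only of why a wrong scale bar distorts these two measurements in opposite directions: area scales with the square of the linear calibration, and fiber count per unit area scales with its inverse square. The sketch below is not the authors' procedure (they re-scanned and re-measured the sections); the function names, calibration values, and measurements are hypothetical.

```python
def rescale_areas(areas_um2, assumed_um_per_px, true_um_per_px):
    """Rescale areas that were measured with a wrong linear calibration.

    Areas scale with the square of the linear calibration factor.
    All inputs here are hypothetical illustration values.
    """
    if assumed_um_per_px <= 0 or true_um_per_px <= 0:
        raise ValueError("calibration factors must be positive")
    factor = (true_um_per_px / assumed_um_per_px) ** 2
    return [a * factor for a in areas_um2]


def rescale_density(fibers_per_mm2, assumed_um_per_px, true_um_per_px):
    """Rescale a fiber count per unit area; it scales with the inverse square."""
    if assumed_um_per_px <= 0 or true_um_per_px <= 0:
        raise ValueError("calibration factors must be positive")
    factor = (assumed_um_per_px / true_um_per_px) ** 2
    return fibers_per_mm2 * factor


# Hypothetical case: the scale was assumed to be 0.5 um/px but was really 1.0 um/px.
corrected_areas = rescale_areas([100.0, 200.0], 0.5, 1.0)
corrected_density = rescale_density(2000.0, 0.5, 1.0)
```

      In this hypothetical case a twofold error in the linear scale makes every fiber area four times too small and the fiber density four times too high.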

      - The labeling of the ordinate of Fig. 2I is not readable (x1000 µm2, or x100 µm2?). Authors should make sure that they look at the same muscle part, as fiber sizes can vary greatly depending on the exact anatomical location. In small laboratory animals, entire muscle cross sections are usually analyzed to prevent such bias. This may prove difficult in large animals; however, small muscles could easily be identified and cross sections of entire muscles analyzed. As myostatin KO concerns all skeletal muscles, authors could consider muscles such as FDB or extraocular muscles.

      Thank you for your careful review and suggestions. The vertical axis of Figure 2I is in units of ×1000 μm2, and each data point represents the measured area of an individual muscle fiber. Because muscle fiber size varies considerably, we plotted the measured areas of all individual muscle fibers, and the mean of the scatter plot was used as the average area of all muscle fibers. We did this to display the distribution of muscle fiber size more intuitively.
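
      A minimal sketch of how such per-fiber measurements could be summarized, assuming a list of individual fiber areas in μm2 and the area of the analyzed field. The function name, the inputs, and the choice to report the median alongside the mean are assumptions for illustration, not the authors' analysis.

```python
from statistics import mean, median


def summarize_fibers(areas_um2, field_area_um2):
    """Summarize individual fiber cross-sectional areas from one field.

    areas_um2: per-fiber areas in um^2 (hypothetical input).
    field_area_um2: area of the analyzed field in um^2 (hypothetical input).
    """
    if not areas_um2:
        raise ValueError("at least one fiber area is required")
    if field_area_um2 <= 0:
        raise ValueError("field area must be positive")
    return {
        "n_fibers": len(areas_um2),
        # Expressed in units of x1000 um^2, matching the axis convention above.
        "mean_csa_x1000_um2": mean(areas_um2) / 1000.0,
        "median_csa_x1000_um2": median(areas_um2) / 1000.0,
        # 1 mm^2 = 1e6 um^2
        "fibers_per_mm2": len(areas_um2) / (field_area_um2 / 1e6),
    }


# Hypothetical field of 0.5 mm^2 containing three measured fibers.
summary = summarize_fibers([1000.0, 2000.0, 3000.0], 500000.0)
```

      Reporting the median next to the mean is one way to show whether a few very large or very small fibers dominate the average.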

      - The material of methods of muscle histology and morphometric studies must be included.

      Thank you for your suggestions. We have supplemented the methods of muscle histology and morphology study, as well as statistical methods for cross-sectional area and quantity of muscle fibers in the material methods.

      - In figures, numbers of experimental animals be given throughout, as well as number of technical repeats. The authors need to provide some minimal data on how the genetically engineered sheep were produced, in addition to how many, the sex etc.....and which of these were analyzed to obtain the data. It is impossible to know when reading this manuscript whether data involving, for example gene seq, westerns, microscopic images etc involves one sheep or some compilation of data.

      Thank you very much for your constructive suggestions, which is of great guiding significance for improving the quality of our manuscript. We have clearly stated the number of experimental animals and the number of any biological replicates in all figure legends. Meanwhile, we have provided detailed information on the generation method of gene edited sheep in the Materials and Methods, which was produced by injecting MSTN sgRNA, FGF5 sgRNA, and Cas9 mRNA into embryos in different ratios.

      - As authors work on Mstn;Fgf5 double KO animals, they should explore whether Fgf5 is expressed in developing sheep muscle, and whether combined KO entails a synergistic effect on muscle development.

      We measured the expression of FGF5 in muscle tissue of WT and MF+/- sheep at 3 months of age; FGF5 expression was significantly reduced in MF+/- sheep compared with WT sheep (Figure S2A). We greatly appreciate your very meaningful and valuable comments on the possible synergistic effects of combined knockdown. Because our current study lacks single gene knockouts of MSTN and FGF5 in sheep, especially homozygous mutants, we will prepare MSTN and FGF5 single gene edited sheep to further explore possible synergistic effects in a follow-up study.

      - The authors should address the question of why their mstn mutation causes fiber hypotrophy, whereas most other work reported the opposite. Why would herein generated mutation act differently? Does mutated myostatin gain a different biological effect? Does it bind to different receptors?

      Thank you very much for your valuable comment. Regarding the possibility of muscle fiber dystrophy in MSTN_Del273C mutation with FGF5 knockout sheep, we performed a statistical analysis of the proportion of centrally nucleated muscle fibers in MF+/- sheep, which can characterize the occurrence of muscle dystrophy to some extent. The results showed no significant difference in the proportion of centrally nucleated muscle fibers between WT and MF+/- sheep (Figure S2E). We also analyzed the mRNA expression levels of MTM1, DMD, IGF1, SMN1, and GAA, genes related to muscle fiber dystrophy and muscle atrophy. Although the levels of MTM1, IGF1, SMN1, and GAA were significantly increased (Figure S2F), this elevation did not lead to muscle fiber dystrophy or muscle atrophy but was instead beneficial for muscle formation. Therefore, we suggest that the phenomenon produced by MSTN_Del273C mutation with FGF5 knockout may not be muscle fiber dystrophy, as MSTN_Del273C mutation with FGF5 knockout significantly promoted the proliferation of sheep skeletal muscle satellite cells (Figure 3A-F). More importantly, MSTN_Del273C mutation with FGF5 knockout improves the muscle phenotype of sheep, including the "double-muscle" phenotype of the rump (Figure 2A), the proportion of gluteus medius in the carcass (Figure 2K), and the proportion of hind leg meat (Table S7). In addition, we analyzed in detail in the discussion why the current mutation produces a phenotype different from that reported in other work, which we suggest may be due to different mutation sites. Whether mutated myostatin acquires different biological effects and whether it binds to different receptors are indeed very thought-provoking questions, which we plan to investigate further in homozygous MSTN and FGF5 mutant sheep.

      - Concerning the in vitro work, authors need to demonstrate whether Mstn and/or FGF5 signaling pathways are altered in myoblasts/myotubes. As both are secreted factors, authors need to show that serum conditioning is changing in myoblast cultures. Authors should perform cultures in which these factors are entirely suppressed and the signaling pathways thus shut down. They could use growth factor depleted supplements and/or add myostatin and FGF5 inhibitors to the serum. They need to determine first the individual effects of myostatin and FGF5 and then test the combined effect. They should also perform the inverse experiment and supplement cultures with recombinant factors, both individually and in combination.

      We greatly appreciate your valuable suggestions. In addition to detecting the MSTN pathway at the cellular level, we also assayed the expression of MSTN receptors and downstream Smad and Jun families in the gluteus medius, and found that MSTN_Del273C mutation with FGF5 knockout led to upregulation of two receptors, while the expression of downstream Smad and Jun families was also inhibited to varying degrees (Figure S4A). Considering the possible serum regulation, we also supplemented the data on serum MSTN regulation. We had previously tested inhibitors of MSTN and FGF5 but did not observe any effect, which we suggest may be due to the nonspecificity of the inhibitors, as there are no sheep-specific MSTN and FGF5 inhibitors. Given that the phenotype of MSTN gene editing is mutation site dependent, we directly cultured skeletal muscle satellite cells using serum from WT and MF+/- sheep. We found that serum from MF+/- sheep promoted the proliferation of skeletal muscle satellite cells (Figure S4D). MSTN_Del273C mutation with FGF5 knockout promoted FOSL1 expression using WT sheep serum (Figure S4E), which was similar to the results of FBS culture and HS induction. The serum from MF+/- sheep strongly stimulated FOSL1 expression and the inhibition of MyoD1 (Figure S4F). These results indicate that serum regulation cannot be ignored after MSTN_Del273C mutation with FGF5 knockout.

      - With above suggested additional experiments, authors would also be able to demonstrate, whether Fosl1 is indeed triggered in response to myostatin and/or FGF5 signaling.

      To determine whether FOSL1 indeed acts downstream of MSTN, we supplemented the expression levels of FOSL1 under serum regulation to support our conclusions. We found that the serum from MF+/- sheep strongly stimulated FOSL1 expression and the inhibition of MyoD1 (Figure S4F).

      - Authors used the t-test in several cases despite low sample numbers, which as such violates the assumption of equal variance. Non-parametric tests should be used in this case.

      Thank you very much for your valuable comments. We apologize for the previous incorrect use of statistical methods. In the revised version, we have re-analyzed all data. Before performing Student’s t-test, we first evaluated the assumptions of normal distribution and equal variance. Two-tailed Student’s t-tests were used only for data that conformed to normal distribution and homogeneity of variance; otherwise, Welch's t-tests were performed.
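
      For readers less familiar with the distinction, the sketch below contrasts the two statistics named above: Student's t pools the two sample variances and uses n1 + n2 - 2 degrees of freedom, whereas Welch's t keeps the variances separate and uses the Welch-Satterthwaite degrees of freedom. It uses only the Python standard library and returns the statistic and degrees of freedom, not a p-value; the sample values are hypothetical and this is not the authors' analysis code.

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Student's two-sample t statistic with pooled variance (assumes equal variances)."""
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)  # sample variances (n - 1 denominator)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


def welch_t(a, b):
    """Welch's t statistic with Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)
    se2 = va / na + vb / nb  # squared standard error without pooling
    t = (mean(a) - mean(b)) / sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df


# Hypothetical samples of three animals each with unequal spread.
wt = [1.0, 2.0, 3.0]
mf = [2.0, 4.0, 6.0]
t_student, df_student = student_t(wt, mf)
t_welch, df_welch = welch_t(wt, mf)
```

      With equal group sizes the two t statistics coincide, but Welch's degrees of freedom are smaller when the variances differ, which makes the test more conservative. Obtaining a p-value additionally requires the t distribution, for example through `scipy.stats.ttest_ind` with `equal_var=False` for the Welch variant.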

      - Authors should state in the legends which statistical test was used.

      Thank you for your suggestion. We have clearly stated the statistical testing method used in all figure legends, which is indeed necessary and important.

      In general, this manuscript should be dramatically scaled back in terms of content, eliminating unnecessary text, figures and tables that do not play a significant role in the findings that were significant. There is some interesting information and data here that can add to the overall base of knowledge surrounding the production of genetically engineered livestock in which myostatin has been targeted for mutation. However, the authors need to focus on their findings that were significant and strongly supported by the data and statistical analysis. Some discussion of findings that support their ideas/hypothesis, but are not statistically significant is fine. But it should not make up the majority of the manuscript which is the case here.

      Thank you for your valuable suggestions, which are essential for improving the quality of our manuscript. We have greatly streamlined and significantly revised the manuscript, removed unnecessary text, figures, and tables.

    1. eLife assessment

      This study presents solid results to demonstrate that arpin is expressed in the endothelium of blood vessels and that its deficiency leads to leaky blood vessels in in vivo and in vitro models. The work does not yet clarify the mechanistic connection between arpin and increased ROCK activity. The study adds some insights to our understanding of the complicated network of proteins that control this process, and it will be useful to individuals within this defined field of study.

    2. Reviewer #1 (Public Review):

      Summary:

      The data clearly demonstrate that arpin is important for vessel barrier function, yet its genetic loss via a CRISPR strategy was not lethal, but led to viable animals in the C57Blk strain at 12 weeks of age, albeit with leaky blood vessels. Pharmacological approaches were employed to demonstrate that loss of arpin led to ROCK1-dependent stress fiber formation that promoted increased permeability.

      Strengths:

      The results clearly demonstrate that arpin is expressed in the endothelium of blood vessels and its deficiency leads to leaky blood vessels in in vivo and in vitro models.

      Weaknesses:

      They conclude vessel leak was not related to enhanced Arp2/3 function through arpin deficiency, but no direct evidence of Arp2/3 activity is provided to support this conclusion. Instead, the authors concluded that ROCK1 activity was elevated in arpin knockdown cells and caused robust stress fiber formation. This idea could be strengthened by testing if ROCK1 inhibition by pharmacological block in arpin KO mice leads to less vascular leakage while pharmacological inhibition of Arp2/3 does not attenuate increased vessel permeability.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors have taken their previous finding that arpin is important for epithelial junctions and extended this to endothelial cells. They find that the positive effects of arpin on endothelial junctions are not dependent on Arp2/3 activity but instead on suppression of actinomyosin contractility.

      Strengths:

      The study uses standard approaches to test each of the components in the model. The quality of the experimental work is good and the amount of experimental evidence is sufficient to support this straightforward story.

      Weaknesses:

      The major weakness is that the story is a simple extension of the previous work on arpin and junctions in epithelial cells. The additional information is that the effects are not via Arp2/3 directly, but instead through an increase in actinomyosin contractility. However, the connection between arpin and increased ROCK activity is not revealed.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the Authors):

      Arpin is a negative regulator of Arp2/3 activity. Here the authors investigated the role of arpin in vascular permeability using appropriate cultured human and murine endothelial monolayers and successfully developed arpin KO mice. The results clearly show arpin is expressed in blood vessels (not clear about lymphatics but given leaky vessels, one wonders). The data show that arpin is important for vessel barrier function, yet its genetic loss still leads to viable animals in the C57Blk strain, albeit with leaky blood vessels. The data are well presented and controls are included. However, the evidence that arpin loss/knockdown causes increased actin functions independent of Arp2/3 is based on pharmacological data and is indirect. Authors conclude that ROCK1 activity is elevated and is the cause of the barrier function lost upon arpin reduction. I do have one suggestion for the authors that involves a new study in these animals, which could strengthen their proposed mechanism that the vascular defects are independent of Arp2/3 activity and rather involve ROCK1 but not ZIPK.

      (1) If arpin is working via ROCK1, as the authors infer, perhaps treatment of arpin-/- mice with ROCK1 inhibitor(s) would attenuate vessel permeability while HS38 treatment would not. This type of study would strengthen the conclusion that ROCK1, but not ZIPK, was involved. Including CK666 if active in mouse cells, could also be tested.

      To analyze vascular permeability in vivo, we performed Miles assays in arpin+/+ and arpin-/- mice using the inhibitors of ROCK1 (Y27632) and ZIPK (HS38). Both Y27632 and HS38 reduced the permeability caused by absence of arpin (new Figure 8E), thus confirming what we observed before in HUVEC (shown in old Figure 7). CK666 did not change the permeability in arpin-/- mice, thus confirming the conclusion that arpin does not regulate vascular permeability via Arp2/3 but rather via ROCK1/ZIPK-mediated stress fiber formation (page 13).

      (2) Fig 5. The data demonstrate that arpin regulates actin filament formation and permeability in HUVEC, but this does not demonstrate that it occurs in an Arp2/3-independent manner. If I understand your data, this is indirect evidence. One needs more information to reach this conclusion. Can the authors measure Arp2/3 activity directly and then test whether arpin knockdown and CK666 have the same capacity to reduce Arp2/3 activity in vitro?

      Arp2/3 activity cannot be measured directly. The commonly used approach is therefore Arp2/3 inhibition via CK666. Our new in vivo permeability assays (see answer above) together with our HUVEC data in Figure 5 clearly show that CK666 does not have the same effect as arpin knock-down, and neither does CK666 rescue the effects of arpin deficiency in vitro and in vivo. Together, these findings clearly suggest that arpin does not regulate endothelial permeability via Arp2/3.

      Minor issues:

      Fig 2, 3 or other Figs: In presented western blots, all proteins should include appropriate mw labels.

      Thank you. Molecular weights have been added to all Western blots.

      Fig 2. Suggest that like your arpin analysis, amounts of AP1AP and PICK1 at baseline and TNF-treatment by blotting should be included. A minor point is yellow color for labels does not stand out and should be changed to another color - as the authors used in Fig 2C.

      We have included Western blots and quantifications for PICK1 in Figure S1A and S1C. An antibody against AP1AP was unfortunately not available.

      The yellow color has been changed to purple for better visibility.

      Fig 2C. The arpin loss at junctions and actin filaments (Figure 2C) is very minor even though it reached statistical significance. It really is not an obvious loss from your 3 color overlay.

      Thank you. It is indeed hard to see. We have now included magnifications in Figure 2C that better show the loss of arpin at junctions.

      Fig 8, text 303-310 shows in vivo evidence of lung congestion and edema. There also appear to be inflammatory cells present in the images. If these are inflammatory cells, it raises the question of whether these mice have an abnormal complete blood cell count (CBC). Suggest adding CBC data for arpin-/- vs control arpin +/+ mice in Fig 8.

      The pathologist observed the presence of lymphocytes and macrophages, indicating the possibility of a (low level) chronic inflammation in arpin-deficient lungs. However, we now also performed hemograms of the mice (new Table S2) that showed no significant difference in the blood cell count of arpin-/- and arpin+/+ mice. Thus, the presence of lymphocytes and macrophages cannot be explained simply by higher leukocyte counts (page 14).

      Line 289, pg 13, Fig 8: Lung levels of arpin are not shown in Fig 8B. Authors must mean another fig?

      Sorry. Arpin protein levels in lungs are shown in figure 8C. This has been corrected on page 13.

      Reviewer #2 (Recommendations For The Authors):

      This is a solid piece of work that adds a small amount of additional factual information to our understanding of cell-cell junctions. The experimental work is of good quality and is sufficient to support the aims of the paper. I think the value of the work is to add this small amount of new knowledge to the archive. I do not believe that further experimental work would add to the paper - it's done. But this doesn't have the impact or completeness for this journal. It belongs in a for-the-record journal.

      We appreciate your overall positive evaluation and your comments that our study represents a solid piece of work with good quality experimental work. However, we are not sure what you mean by “it belongs in a for-the-record journal”. Anyway, we agree that our study does not reveal a complete mechanism of how arpin regulates actin stress fibers, but we respectfully disagree that our study only adds a “small amount of additional factual information”. We may not have been very clear about it, but we present in this study several new discoveries and although some are descriptive in nature that does not make them trivial or less important. We provide for the first time experimental evidence that: 1) arpin is expressed in endothelial cells in vitro and in vivo, and downregulated during inflammation; 2) presence of arpin is required for proper endothelial permeability regulation and junction architecture; 3) arpin exerts these functions in an Arp2/3-independent manner; 4) arpin controls actomyosin contractility in a ROCK1- and ZIPK-dependent fashion; 5) arpin knock-out mice are viable and breed and develop normally but show histological characteristics of a vascular phenotype and increased vascular permeability that can be rescued by inhibition of ROCK1 and ZIPK. The fact that arpin fulfills its functions in endothelial cells independently of the Arp2/3 complex is of special relevance as previously the only known function of arpin was the inhibition of the Arp2/3 complex. Thus, we believe that our study adds a significant amount of new information to the literature. Thank you very much.

    1. Reviewer #1 (Public Review):

      Summary:

      BMP signaling is, arguably, best known for its role in dorsoventral patterning, but not in nematodes, where it regulates body size. In their paper, Vora et al. analyze ChIP-Seq and RNA-Seq data to identify direct transcriptional targets of SMA-3 (Smad) and SMA-9 (Schnurri) and understand the respective roles of SMA-3 and SMA-9 in the nematode model Caenorhabditis elegans. The authors use publicly available SMA-3 and SMA-9 ChIP-Seq data, their own RNA-Seq data from SMA-3 and SMA-9 mutants, and bioinformatic analyses to identify the genes directly controlled by these two transcription factors (TFs) and find approximately 350 such targets for each. They show that all SMA-3-controlled targets are positively controlled by SMA-3 binding, while SMA-9-controlled targets can be either up or downregulated by SMA-9. 129 direct targets were shared by SMA-3 and SMA-9, and, curiously, the expression of 15 of them was activated by SMA-3 but repressed by SMA-9. Since genes responsible for cuticle collagen production were prominent among the SMA-3 targets, the authors focused on trying to understand the body size defect known to be elicited by the modulation of BMP signaling. Vora et al. provide compelling evidence that this defect is likely to be due to problems with the BMP signaling-dependent collagen secretion necessary for cuticle formation.

      Strengths:

      Vora et al. provide a valuable analysis of ChIP-Seq and RNA-Seq datasets, which will be very useful for the community. They also shed light on the mechanism of the BMP-dependent body size control by identifying SMA-3 target genes regulating cuticle collagen synthesis and by showing that downregulation of these genes affects body size in C. elegans.

      Weaknesses:

      (1) Although the analysis of the SMA-3 and SMA-9 ChIP-Seq and RNA-Seq data is extremely useful, the goal "to untangle the roles of Smad and Schnurri transcription factors in the developing C. elegans larva", has not been reached. While the role of SMA-3 as a transcriptional activator appears to be quite straightforward, the function of SMA-9 in the BMP signaling remains obscure. The authors write that in SMA-9 mutants, body size is affected, but they do not show any data on the mechanism of this effect.

      (2) The authors clearly show that both TFs can bind independently of each other, however, by using distances between SMA-3 and SMA-9 ChIP peaks, they claim that when the peaks are close these two TFs act as complexes. In the absence of proof that SMA-3 and SMA-9 physically interact (e.g. that they co-immunoprecipitate - as they do in Drosophila), this is an unfounded claim, which should either be experimentally substantiated or toned down.

      (3) The second part of the paper (the collagen story) is very loosely connected to the first part. dpy-11 encodes an enzyme important for cuticle development, and it is a differentially expressed direct target of SMA-3. dpy-11 can be bound by SMA-9, but it is not affected by this binding according to RNA-Seq. Thus, technically, this part of the paper does not require any information about SMA-9. However, this can likely be improved by addressing the function of the 15 genes, with the opposing mode of regulation by SMA-3 and SMA-9.

      (4) The Discussion does not add much to the paper - it simply repeats the results in a more streamlined fashion.

    2. Reviewer #2 (Public Review):

      In the present study, Vora et al. elucidated the transcription factors downstream of the BMP pathway components Smad and Schnurri in C. elegans and their effects on body size. Using a combination of a broad range of techniques, they compiled a comprehensive list of genome-wide downstream targets of the Smads SMA-3 and SMA-9. They found that both proteins have an overlapping spectrum of transcriptional target sites they control, but also unique ones. Thereby, they also identified genes involved in one-carbon metabolism or the endoplasmic reticulum (ER) secretory pathway. In an elaborate effort, the authors set out to characterize the effects of numerous of these targets on the regulation of body size in vivo as the BMP pathway is involved in this process. Using the reporter ROL-6::wrmScarlet, they further revealed that not only collagen production, as previously shown, but also collagen secretion into the cuticle is controlled by SMA-3 and SMA-9. The data presented by Vora et al. provide in-depth insight into the means by which the BMP pathway regulates body size, thus offering a whole new set of downstream mechanisms that are potentially interesting to a broad field of researchers.

      The paper is mostly well-researched, and the conclusions are comprehensive and supported by the data presented. However, certain aspects need clarification and potentially extended data.

      (1) The BMP pathway is active during development and growth. Thus, it is logical that the data shown in the study by Vora et al. is based on L2 worms. However, it raises the question of if and how the pattern of transcriptional targets of SMA-3 and SMA-9 changes with age or in the male tail, where the BMP pathway also has been shown to play a role. Is there any data to shed light on this matter or are there any speculations or hypotheses?

      (2) As it was shown that SMA-3 and SMA-9 potentially act in a complex to regulate the transcription of several genes, it would be interesting to know whether the two interact with each other or if the cooperation is more indirect.

      (3) It would help the understanding of the data even more if the authors could specifically state if there were collagens among the genes regulated by SMA-3 and SMA-9 and which.

      (4) The data on the role of SMA-3 and SMA-9 in the regulation of the secretion of collagens from the hypodermis is highly intriguing. The authors use ROL-6 as a reporter for the secretion of collagens. Is ROL-6 a target of SMA-9 or SMA-3? Even if this is not the case, the data would gain even more strength if a comparable quantification of the cuticular levels of ROL-6 were shown in Figure 6, and potentially a ratio of cuticular versus hypodermal levels. By that, the levels of secretion versus production can be better appreciated.

      (5) It is known that the BMP pathway controls several processes besides body size. The discussion would benefit from a broader overview of how the identified genes could contribute to body size. The focus of the study is on collagen production and secretion, but it would be interesting to have some insights into whether and how other identified proteins could play a role or whether they are likely to not be involved here (such as the ones normally associated with lipid metabolism, etc.).

    3. eLife assessment

      This study presents valuable findings that will allow for a better understanding of the targets of SMAD and Schnurri, transcription factors that act downstream in the BMP signalling pathway. The evidence presented in this manuscript is solid, but because the claims of a SMA-3/SMA-9 complex are not experimentally supported, they should be toned down. Revising the discussion to give a broader context of BMP-driven body size control would help the readers put this work in a larger context. This work will be of broad interest to colleagues studying BMP signalling across phyla.

    4. Author response:

      Reviewer #1 (Public Review):

      Summary: 

      BMP signaling is, arguably, best known for its role in dorsoventral patterning, but not in nematodes, where it regulates body size. In their paper, Vora et al. analyze ChIP-Seq and RNA-Seq data to identify direct transcriptional targets of SMA-3 (Smad) and SMA-9 (Schnurri) and understand the respective roles of SMA-3 and SMA-9 in the nematode model Caenorhabditis elegans. The authors use publicly available SMA-3 and SMA-9 ChIP-Seq data, their own RNA-Seq data from SMA-3 and SMA-9 mutants, and bioinformatic analyses to identify the genes directly controlled by these two transcription factors (TFs) and find approximately 350 such targets for each. They show that all SMA-3-controlled targets are positively controlled by SMA-3 binding, while SMA-9-controlled targets can be either up or downregulated by SMA-9. 129 direct targets were shared by SMA-3 and SMA-9, and, curiously, the expression of 15 of them was activated by SMA-3 but repressed by SMA-9. Since genes responsible for cuticle collagen production were prominent among the SMA-3 targets, the authors focused on trying to understand the body size defect known to be elicited by the modulation of BMP signaling. Vora et al. provide compelling evidence that this defect is likely to be due to problems with the BMP signaling-dependent collagen secretion necessary for cuticle formation.

      We thank the reviewer for this supportive summary. We would like to clarify the status of the publicly available ChIP-seq data. We generated the GFP tagged SMA-3 and SMA‑9 strains and submitted them to be entered into the queue for ChIP-seq processing by the modENCODE (later modERN) consortium. Due to the nature of the consortium’s funding, the data were required to be released publicly upon completion. Nevertheless, we have provided the first comprehensive analysis of these datasets.

      Strengths: 

      Vora et al. provide a valuable analysis of ChIP-Seq and RNA-Seq datasets, which will be very useful for the community. They also shed light on the mechanism of the BMP-dependent body size control by identifying SMA-3 target genes regulating cuticle collagen synthesis and by showing that downregulation of these genes affects body size in C. elegans. 

      Weaknesses: 

      (1) Although the analysis of the SMA-3 and SMA-9 ChIP-Seq and RNA-Seq data is extremely useful, the goal "to untangle the roles of Smad and Schnurri transcription factors in the developing C. elegans larva", has not been reached. While the role of SMA-3 as a transcriptional activator appears to be quite straightforward, the function of SMA-9 in the BMP signaling remains obscure. The authors write that in SMA-9 mutants, body size is affected, but they do not show any data on the mechanism of this effect. 

      We thank the reviewer for directing our attention to the lack of clarity about SMA-9’s function. We will revise the text to highlight what this study and others demonstrate about SMA-9’s role in body size. We also plan to analyze additional target genes to deepen our model for how SMA-3 and SMA-9 interact functionally to produce a given transcriptional response.

      (2) The authors clearly show that both TFs can bind independently of each other, however, by using distances between SMA-3 and SMA-9 ChIP peaks, they claim that when the peaks are close these two TFs act as complexes. In the absence of proof that SMA-3 and SMA-9 physically interact (e.g. that they co-immunoprecipitate - as they do in Drosophila), this is an unfounded claim, which should either be experimentally substantiated or toned down. 

      A physical interaction between Smads and Schnurri has been amply demonstrated in other systems. The limitation in the previous work is that only a small number of target genes was analyzed. Our goal in this study was to determine how widespread this interaction is on a genomic scale. Our analyses demonstrate for the first time that a Schnurri transcription factor has significant numbers of both Smad-dependent and Smad-independent target genes. We will revise the text to clarify this point.

      (3) The second part of the paper (the collagen story) is very loosely connected to the first part. dpy-11 encodes an enzyme important for cuticle development, and it is a differentially expressed direct target of SMA-3. dpy-11 can be bound by SMA-9, but it is not affected by this binding according to RNA-Seq. Thus, technically, this part of the paper does not require any information about SMA-9. However, this can likely be improved by addressing the function of the 15 genes, with the opposing mode of regulation by SMA-3 and SMA-9. 

      We appreciate this suggestion and will clarify how SMA-9 and its target genes contribute to collagen organization and body size regulation.

      (4) The Discussion does not add much to the paper - it simply repeats the results in a more streamlined fashion. 

      We thank the reviewer for this suggestion. We will add more context to the Discussion.

      Reviewer #2 (Public Review): 

      In the present study, Vora et al. elucidated the transcription factors downstream of the BMP pathway components Smad and Schnurri in C. elegans and their effects on body size. Using a combination of a broad range of techniques, they compiled a comprehensive list of genome-wide downstream targets of the Smads SMA-3 and SMA-9. They found that both proteins have an overlapping spectrum of transcriptional target sites they control, but also unique ones. Thereby, they also identified genes involved in one-carbon metabolism or the endoplasmic reticulum (ER) secretory pathway. In an elaborate effort, the authors set out to characterize the effects of numerous of these targets on the regulation of body size in vivo as the BMP pathway is involved in this process. Using the reporter ROL-6::wrmScarlet, they further revealed that not only collagen production, as previously shown, but also collagen secretion into the cuticle is controlled by SMA-3 and SMA-9. The data presented by Vora et al. provide in-depth insight into the means by which the BMP pathway regulates body size, thus offering a whole new set of downstream mechanisms that are potentially interesting to a broad field of researchers.

      The paper is mostly well-researched, and the conclusions are comprehensive and supported by the data presented. However, certain aspects need clarification and potentially extended data. 

      (1) The BMP pathway is active during development and growth. Thus, it is logical that the data shown in the study by Vora et al. is based on L2 worms. However, it raises the question of if and how the pattern of transcriptional targets of SMA-3 and SMA-9 changes with age or in the male tail, where the BMP pathway also has been shown to play a role. Is there any data to shed light on this matter or are there any speculations or hypotheses? 

      We agree that these are intriguing questions and we are interested in the roles of transcriptional targets at other developmental stages and in other physiological functions, but these analyses are beyond the scope of the current study.

      (2) As it was shown that SMA-3 and SMA-9 potentially act in a complex to regulate the transcription of several genes, it would be interesting to know whether the two interact with each other or if the cooperation is more indirect. 

      A physical interaction between Smads and Schnurri has been amply demonstrated in other systems. Our goal in this study was not to validate this physical interaction, but to analyze functional interactions on a genome-wide scale.

      (3) It would help the understanding of the data even more if the authors could specifically state if there were collagens among the genes regulated by SMA-3 and SMA-9 and which. 

      We thank the reviewer for this suggestion and will add the requested information in the text.

      (4) The data on the role of SMA-3 and SMA-9 in the regulation of the secretion of collagens from the hypodermis is highly intriguing. The authors use ROL-6 as a reporter for the secretion of collagens. Is ROL-6 a target of SMA-9 or SMA-3? Even if this is not the case, the data would gain even more strength if a comparable quantification of the cuticular levels of ROL-6 were shown in Figure 6, and potentially a ratio of cuticular versus hypodermal levels. By that, the levels of secretion versus production can be better appreciated. 

      rol-6 has been identified as a transcriptional target of this pathway. The level of ROL-6 protein, however, is not changed in sma-3 and sma-9 mutants, indicating that there is post-transcriptional compensation. We will include these data in the revised manuscript.

      (5) It is known that the BMP pathway controls several processes besides body size. The discussion would benefit from a broader overview of how the identified genes could contribute to body size. The focus of the study is on collagen production and secretion, but it would be interesting to have some insights into whether and how other identified proteins could play a role or whether they are likely to not be involved here (such as the ones normally associated with lipid metabolism, etc.). 

      We will add this information to the Discussion.

    1. eLife assessment

      This study describes useful mouse models of knock-ins of human STING1 variants and an assessment of these variants' action in mouse immune cells. While the implications of the variants in the inflammatory response are of significant interest, limitations are still found in the authors' interpretation and conclusions made, and the evidence for the conclusion remains incomplete.

    2. Reviewer #1 (Public Review):

      Summary:

      This manuscript by Aybar-Torres et al investigated the effect of common human STING1 variants on STING-mediated T cell phenotypes in mice. The authors previously made knock-in mice expressing human STING1 alleles HAQ or AQ, and here they established a new knock-in line Q293. The authors stimulated cells isolated from these mice with STING agonists and found that all three human mutant alleles resist cell death, leading to the conclusion that R293 residue is essential for STING-mediated cell death (there are several caveats with this conclusion, more below). The authors also bred HAQ and AQ alleles to the mouse Sting1-N153S SAVI mouse and observed varying levels of rescue of disease phenotypes with the AQ allele showing more complete rescue than the HAQ allele. The Q293 allele was not tested in the SAVI model. They conclude that the human common variants such as HAQ and AQ have a dominant negative effect over the gain-of-function SAVI mutants.

      Strengths:

      The authors and Dr. Jin's group previously made important observations of common human STING1 variants, and these knock-in mouse models are essential for understanding the physiological function of these alleles.

      Weaknesses:

      However, although some of the observations reported here are interesting, the data collectively does not support a unified model. The authors seem to be drawing two sets of conclusions from in vitro and in vivo experiments, and neither mechanism is clear. Several experiments need better controls, and these knock-in mice need more comprehensive functional characterization.

    3. Reviewer #2 (Public Review):

      Aybar-Torres and colleagues utilize common human STING alleles to dissect the mechanism of SAVI inflammatory disease. The authors demonstrate that these common alleles alleviate SAVI pathology in mice, and perhaps more importantly use the differing functionality of these alleles to provide insight into the requirements of SAVI disease induction. Their findings suggest that it is residue A230 and/or Q293 that are required for SAVI induction, while the ability to induce an interferon-dependent inflammatory response is not. This is nicely exemplified by the AQ/SAVI mice that have an intact inflammatory response to STING activation, yet minimal disease progression. As both mutants seem to be resistant to STING-dependent cell death, this manuscript also alludes to the importance of STING-dependent cell death, rather than STING-dependent inflammation, in the progression of SAVI pathology. I believe this manuscript makes some important connections between STING pathology mouse models and human genetics that would contribute to the field.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Summary Responses: Besides the WT allele, equivalent to the mouse TMEM173 gene, the human TMEM173 gene has two common alleles: the HAQ and AQ alleles, carried by billions of people. The main conclusions and interpretations, summarized in the Title and Abstract, are: i) unlike the WT TMEM173 allele, the HAQ and AQ alleles are resistant to STING activation-induced cell death; ii) STING residue 293 is critical for cell death; iii) the HAQ and AQ alleles are dominant to the SAVI allele; iv) one copy of the AQ allele rescues the SAVI disease in mice. We propose that STING research and STING-targeting immunotherapy should consider human TMEM173 heterogeneity. These interpretations and conclusions were based on data and logic. We welcome alternative, logical interpretations and collaborations to advance human TMEM173 research.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This manuscript by Aybar-Torres et al investigated the effect of common human STING1 variants on STING-mediated T cell phenotypes in mice. The authors previously made knock-in mice expressing human STING1 alleles HAQ or AQ, and here they established a new knock-in line Q293. The authors stimulated cells isolated from these mice with STING agonists and found that all three human mutant alleles resist cell death, leading to the conclusion that R293 residue is essential for STING-mediated cell death (there are several caveats with this conclusion, more below). The authors also bred HAQ and AQ alleles to the mouse Sting1-N153S SAVI mouse and observed varying levels of rescue of disease phenotypes with the AQ allele showing more complete rescue than the HAQ allele. The Q293 allele was not tested in the SAVI model. They conclude that the human common variants such as HAQ and AQ have a dominant negative effect over the gain-of-function SAVI mutants.

      Strengths:

      The authors and Dr. Jin's group previously made important observations of common human STING1 variants, and these knock-in mouse models are essential for understanding the physiological function of these alleles.

      Weaknesses:

      However, although some of the observations reported here are interesting, the data collectively does not support a unified model. The authors seem to be drawing two sets of conclusions from in vitro and in vivo experiments, and neither mechanism is clear. Several experiments need better controls, and these knock-in mice need more comprehensive functional characterization.

      (1) In Figure 1, the authors are trying to show that STING agonist-induced splenocyte cell death is blocked by the HAQ, AQ and Q alleles. The conclusion at line 134 should be splenocytes, not lymphocytes. Most experiments in this figure were done with a mixed population that may involve cell-to-cell communication. Although TBK1-dependence is likely, a single inhibitor treatment of a mixed population is not sufficient to reach this conclusion.

      We greatly appreciate Reviewer 1's insights. We changed “lymphocytes” to “splenocytes” (line 133) as suggested. We respectfully disagree with Reviewer 1’s comments on TBK1. First, we used two different TBK1 inhibitors: BX795 and GSK8612. Second, because BX795 also inhibits PDK1, we used a PDK1 inhibitor, GSK2334470. Third, both BX795 and GSK8612 completely inhibited diABZI-induced splenocyte cell death (Figure 1B) (lines 128 – 133). The logical conclusion is “TBK1 activation is required for STING-mediated mouse spleen cell death ex vivo” (line 117).

      Our discovery that the common human TMEM173 alleles are resistant to STING activation-induced cell death is a substantial finding. It further strengthens the argument that the HAQ and AQ alleles are functionally distinct from the WT allele 1-3. We wish to underscore the crucial message of this study, that 'STING research and STING-targeting immunotherapy should consider TMEM173 heterogeneity in humans' (line 37), which has been largely overlooked in current STING clinical trials 4.

      Regarding STING-mediated cell death, as we stated in the Introduction (lines 65-77): i) STING-mediated cell death is cell type-dependent 5-7 and type I IFN-independent 5,7,8. ii) The in vivo biological significance of STING-mediated cell death is not clear 7,8. iii) The mechanisms of STING-mediated cell death remain controversial. Multiple cell death pathways, i.e., apoptosis, necroptosis, pyroptosis, ferroptosis, and PANoptosis, are proposed 7,9,10. In SAVI/HAQ and SAVI/AQ mice, lymphopenia was prevented and SAVI disease was alleviated. Thus, the manuscript provides some answers to the biological significance of STING-mediated cell death in vivo, which is new. Regarding the molecular mechanism, splenocytes from Q293/Q293 mice are resistant to STING-mediated cell death. The logical conclusion is that amino acid 293 is critical for STING-mediated cell death (line 29).

      Extensive studies, beyond the scope of this manuscript, are needed on how aa293 and TBK1 mediate STING-mediated cell death to resolve the controversies in the field (e.g. apoptosis, necroptosis, pyroptosis, ferroptosis, and PANoptosis).

      (2) The Q293 knock-in mouse needs to be characterized and compared to HAQ and AQ. Is this mutant expressed in tissues? Does this mutant still produce IFN and other STING activities? Is the protein expression level altered on Western blot? Is the mutant protein trafficking affected? In the authors' previous publications and some of the Western blots here, expression levels of each of these human STING1 proteins in mice are drastically different. HAQ and AQ also have different effects on metabolism (pmid: 36261171), which could complicate interpretation of the T cell phenotypes.

      These are very important questions that require rigorous investigations that are beyond the scope of this manuscript. This manuscript, titled “The common TMEM173 HAQ, AQ alleles rescue CD4 T cellpenia, restore T-regs, and prevent SAVI (N153S) inflammatory disease in mice”, does not focus on Q293 mice. We have been investigating the common human TMEM173 alleles since 2011, from the discovery 11, mouse models 1,3, and a human clinical trial 2, to human genetics studies 3. This manuscript is another step towards understanding these common human TMEM173 alleles, with the new discovery that the HAQ and AQ alleles are resistant to STING-mediated cell death.

      (3) HAQ/WT and AQ/WT splenocytes are protected from STING agonist-induced cell death equally well (Figure 1G). HAQ/SAVI shows less rescue compared to AQ/SAVI. These are interesting observations, but the mechanism is unclear and not clearly discussed. E.g., how does AQ protect against disease pathology better than HAQ (that contains AQ)? Does the Q293 allele also fully rescue SAVI?

      In this manuscript, Figure 6 shows AQ/SAVI had more T-regs than HAQ/SAVI (lines 251 – 261). In our previous publication on HAQ, AQ knockin mice, we showed that AQ T-regs have more IL-10 than HAQ T-regs 3. Thus, increased IL-10+ Tregs in AQ mice may contribute to an improved phenotype in AQ/SAVI compared to HAQ/SAVI. However, we are not excluding other contributions (e.g. metabolic difference) (lines 332-335). We are exploring these possibilities.  

      (4) Figure 2 feels out of place. First of all, why are the authors using human explant lung tissues? PBMCs should be a better source for lymphocytes. In untreated conditions, both CD4 and B cells show ~30% dying cells, but CD8 cells show 0% dying cells. This raises technical concerns about the CD8 T cell property or gating strategy because in the mouse experiment (Figure 1A) all primary lymphocytes show ~30% cell death at steady-state. Second, in Figure 2C, this type of partial effect needs multiple human donors to confirm. Third, the reconstitution of THP1 cells seems out of place. STING-mediated cell death mechanisms in myeloid and lymphoid cells are likely different. If the authors want to demonstrate cell death in myeloid cells using THP1, then these reconstituted cell lines need to be better validated (expression, IFN signaling, etc.). The parental THP1 cells are HAQ/HAQ; how does that compare to the reconstitutions? There are published studies showing THP1-STING-KO cells reconstituted with human variants do not respond to STING agonists as expected. The authors need to be scientifically rigorous in validation and cautious in their interpretations.

      Figure 2 is necessary because it reveals the difference between mouse and human STING cell death, which is critical to understanding STING in human health and diseases (lines 160-161). Figure 2A-2B showed that STING activation killed human CD4 T cells, but not human CD8 T cells or B cells. This observation is different from Figure 1A, where STING activation killed mouse CD4, CD8 T cells, and CD19 B cells, revealing the species-specific STING cell death responses. Regarding human CD8 T cells, as we stated in the Discussion (lines 323-325), human CD8 T cells (PBMC) are not as susceptible as the CD4 T cells to STING-induced cell death 8. We used lung lymphocytes that showed similar observations (Figure 2A). For Figure 2C, we used 2 WT/HAQ and 3 WT/WT individuals (lines 738-739). We generated HAQ, AQ THP-1 cells in STING-KO THP-1 cells (InvivoGen, cat. no. thpd-kostg) (lines 380-387).

      A recent study found that a new STING agonist SHR1032 induces cell death in STING-KO THP-1 cells expressing WT(R232) human STING 10 (line 182). SHR1032 suppressed THP1-STING-WT(R232) cell growth at GI50: 23 nM while in the parental THP1-STING-HAQ cells, the GI50 of SHR1032 was >103 nM 10. Cytarabine was used as an internal control where SHR1032 killed more robustly than cytarabine in the THP1-STING-WT(R232) cells but much less efficiently than cytarabine in the THP-1-STING-HAQ cells 10. 

      Our manuscript rigorously uses mouse splenocytes, human lung lymphocytes, THP-1 reconstituted with HAQ, AQ, and HAQ/SAVI, AQ/SAVI mice, to demonstrate that the common human HAQ, AQ alleles are resistant to STING cell death in vitro and in vivo.

      We agree with Reviewer 1 that STING-mediated cell death mechanisms in myeloid and lymphoid cells may be different and likely contribute to the different mechanisms proposed in STING cell death research 7,9,10. Our study focuses on the in vivo STING-mediated T cellpenia.

      (5) Figure 2G, H, I are confusing. AQ is more active in producing IFN signaling than HAQ and Q is the least active. How to explain this?

      We stated in the Introduction that “AQ responds to CDNs and produce type I IFNs in vivo and in vitro 3,12,13” (lines 92-93). We reported that the AQ knock-in mice responded to STING activation 3. We previously showed that there was negative natural selection on the AQ allele in individuals outside of Africa 3. 28% of Africans are WT/AQ but only 0.6% of East Asians are WT/AQ 3. In contrast, the HAQ allele was positively selected in non-Africans 3. Investigation to understand the mechanisms and biological significance of these naturally selected human TMEM173 alleles has been ongoing in the lab.

      (6) The overall model is unclear. If HAQ, AQ and Q are loss-of-function alleles and Q is the key residue for STING-mediated cell death, then why is AQ the most active in producing IFN signaling, and why does AQ/SAVI rescue disease most completely? If these human variants act as dominant negatives, which would be consistent with the WT/het data, then how do you explain that AQ is more dominant negative than HAQ?

      In this manuscript, Figure 6 shows AQ/SAVI had more T-regs than HAQ/SAVI (lines 251 – 261). In our previous publication on HAQ, AQ knockin mice, we showed that AQ T-regs have more IL-10 and mitochondria activity than HAQ T-regs 3. Nevertheless, we are not excluding other contributions (e.g. metabolic difference) by the AQ allele (lines 332-335). Last, we used modern human evolution to discover the dominance of these common human STING alleles. In modern humans outside Africa, HAQ was positively selected while AQ was negatively selected 3. However, AQ is likely dominant to HAQ because there are no HAQ/AQ individuals outside Africa. The genetic dominance of common human TMEM173 alleles is a new concept. More investigation is ongoing.

      (7) As a general note, SAVI disease phenotypes involve multiple cell types. Lymphocyte cell death is only one of them. The authors' characterization of SAVI pathology is limited and did not analyze immunopathology of the lung.

      Both radioresistant parenchymal and/or stromal cells and hematopoietic cells influence SAVI pathology in mice 14,15. Nevertheless, the lack of CD4 T cells, including the anti-inflammatory T-regs, likely contributes to the inflammation in SAVI mice and patients 16. We characterized lung function, lung inflammation (Figure 4), lung neutrophils, and inflammatory monocyte infiltration (Figure S5) (lines 232-235).

      (8) Line 281, the discussion on HIV T cell death mechanism is not relevant and over-stretching. This study did not evaluate viral infection in T cells at all. The original finding of HAQ/HAQ enrichment in HIV/AIDS was 2/11 in LTNP vs 0/11 in control, arguably not the strongest statistics.

      Several publications have linked STING to HIV pathogenesis 17-22 (line 271). CD4 T cellpenia is a hallmark of AIDS. The manuscript studies STING activation-induced T cellpenia in vivo. It is not a stretch to ask, for example, whether preventing STING T cell death (e.g. via the HAQ, AQ alleles) can restore CD4 T cell counts and improve care for AIDS patients.
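      As a rough illustration of the statistical concern raised in comment (8), the counts quoted there (2/11 HAQ/HAQ in LTNP vs 0/11 in controls) can be put through a two-sided Fisher exact test. This is only a sketch: the choice of test and the helper name `fisher_exact_two_sided` are assumptions made here for illustration, not the analysis used in the original study.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Hypothetical helper for illustration; sums the hypergeometric
    probabilities of all tables no more likely than the observed one.
    """
    row1, row2 = a + b, c + d      # group sizes (e.g. LTNP, control)
    col1 = a + c                   # total carriers across both groups
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that x of the col1 carriers fall in the first group.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

# Counts quoted by the reviewer: 2 of 11 LTNP vs 0 of 11 controls.
p_value = fisher_exact_two_sided(2, 9, 0, 11)
print(round(p_value, 3))  # 0.476
```

      Under these assumptions the p-value is about 0.48, which is consistent with the reviewer's remark that the enrichment alone is weak evidence.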

      Reviewer #2 (Public Review):

      Aybar-Torres and colleagues utilize common human STING alleles to dissect the mechanism of SAVI inflammatory disease. The authors demonstrate that these common alleles alleviate SAVI pathology in mice, and perhaps more importantly use the differing functionality of these alleles to provide insight into requirements of SAVI disease induction. Their findings suggest that it is residue A230 and/or Q293 that are required for SAVI induction, while the ability to induce an interferon-dependent inflammatory response is not. This is nicely exemplified by the AQ/SAVI mice that have an intact inflammatory response to STING activation, yet minimal disease progression. As both mutants seem to be resistant to STING-dependent cell death, this manuscript also alludes to the importance of STING-dependent cell death, rather than STING-dependent inflammation, in the progression of SAVI pathology. While I have some concerns, I believe this manuscript makes some important connections between STING pathology mouse models and human genetics that would contribute to the field.

      Some points to consider:

      (1) While the CD4+ T cell counts from HAQ/SAVI and AQ/SAVI mice suggest that these T cells are protected from STING-dependent cell death, an assay that explores this more directly would strengthen the manuscript. This is also supported by Fig 2C, but I believe a strength of this manuscript is the comparison between the two alleles. Therefore, if possible, I would recommend the isolation of T cells from these mice and direct stimulation with diABZI or other STING agonist with a cell death readout.

      Please see the new Figure S3 for cell death by diABZI, DMXAA in Splenocytes from WT/WT, WT/HAQ, HAQ/SAVI, AQ/SAVI mice. The HAQ/SAVI and AQ/SAVI splenocytes showed similar partial resistance to STING activation-induced cell death (lines 214-216).

      (2) Related to the above point - further exemplifying that the Q293 locus is essential to disease, even in human cells, would also strengthen the paper. It seems that CD4 T cell loss is a major component of human SAVI. While not completely necessary, repeating the THP1 cell death experiments from Fig 2 with a human T cell line would round out the study nicely.

      We examined HAQ, AQ mouse splenocytes, HAQ human lung lymphocytes, THP-1 reconstituted with HAQ, AQ, and HAQ/SAVI, AQ/SAVI mice, to demonstrate that the common human HAQ, AQ alleles are resistant to STING cell death in vitro and in vivo. Additional human T cell line work would not add much. We hope to conduct more STING cell death experiments on human PBMC or lung lymphocytes from HAQ, AQ individuals as we continue the human STING alleles investigation.

      (3) While I found the myeloid cell counts and BMDM data interesting, I think some more context is needed to fully loop this data into the story. Is myeloid cell expansion exemplified by SAVI patients? Do we know if myeloid cells are the major contributors to the inflammation these patients experience? Why should the SAVI community care about the Q293 locus in myeloid cells?

      This is likely a misunderstanding. We use BMDM for the purpose of comparing STING signaling (TBK1, IRF3, NFkB, STING activation) by WT/SAVI, HAQ/SAVI, AQ/SAVI. Ideally, we would like to compare STING signaling in CD4 T cells from WT/SAVI to HAQ/SAVI, AQ/SAVI mice. However, WT/SAVI has no CD4 T cells. In using BMDM instead, we are making the assumption that the basic STING signaling (TBK1, IRF3, NFkB, STING activation) is conserved between T cells and macrophages.

      (4) The functional assays in Figure 4 are exciting and really connect the alleles to disease progression. To strengthen the manuscript and connect all the data, I would recommend additional readouts from these mice that address the inflammatory phenotype shown in vitro in Figure 5. For example, measuring cytokines from these mice via ELISA or perhaps even Western blots looking for NFkB or STING activation would be supportive of the story. This would also allow for some tissue specificity. I believe looking for evidence of inflammation and STING activation in the lungs of these mice, for example, would further connect the data to human SAVI pathology.

      Reviewer 2 suggests looking for evidence of inflammation and STING activation in the lungs of HAQ/SAVI, AQ/SAVI. We would like to elaborate further. First, anti-inflammatory treatments, e.g. steroids, DMARDs, IVIG, Etanercept (TNF), rituximab, Nifedipine, amlodipine, etc., all failed in SAVI patients 23. JAK inhibitors on SAVI had mixed outcomes (lines 55-58). Second, Figure S5 examined lung neutrophils and inflammatory monocyte infiltration. Interestingly, while AQ/SAVI mice had better lung function than HAQ/SAVI mice (Figure 4D, 4E vs 4H, 4I), HAQ/SAVI and AQ/SAVI lungs had comparable neutrophils and inflammatory monocyte infiltration (Figure S5). Last, SAVI is classified as type I interferonopathy 23, but the lung diseases of SAVI are mainly independent of type I IFNs 24-27. The AQ allele suppresses SAVI in vivo. Understanding the mechanisms by which AQ rescues SAVI may lead to curative care for SAVI patients.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      One suggestion is to streamline this study by focusing on STING-mediated cell death only in CD4 T cells. The authors can use in vitro PBMC isolated human T cells, ex vivo T cells from the knock-in mice, and in vivo T cells from the SAVI breeding. The current manuscript includes myeloid cell death, Tregs, complex SAVI disease pathology, which is too confusing and too complex to explain with the varying effect from the three human STING1 variants.

      We sincerely appreciate Reviewer 1’s suggestion. The goal of our human STING alleles research has always been translational, i.e. improving human health. Even as a monogenetic disease, the SAVI pathology is still complex. For example, though classified as a type I interferonopathy, SAVI is largely independent of type I IFNs. Similarly, STING-activation-induced cell death, while contributing to SAVI, is not the whole story, as the Reviewer pointed out in Comments 3, 6, and 7. HAQ/SAVI mice still died early and had lung dysfunction (Figure 4). In contrast, AQ/SAVI mice had restored lifespan and lung function. Figure 6 shows different T-regs between AQ/SAVI and HAQ/SAVI mice. In addition, AQ mice had more IL-10+ T-regs than HAQ mice 3. Therefore, we are excited about developing AQ-based curative therapy for SAVI patients (preventing cell death and inducing immune tolerance). Again, we thank the Reviewer for the suggestion. Additional research is ongoing.

      Reviewer #2 (Recommendations For The Authors):

      Minor points

      (1) Generation of THP1 cells with the human STING alleles is missing from methods.

      We added the protocol in the methods (lines 380-387). The THP-1 KO line stably expressing WT STING was first described by Weikang Tao’s group 10.

      (2) Some abbreviations are not expanded (CDA).

      CDA is expanded as cyclic di-AMP (e.g. line 375).

      References.

      (1) Patel, S. et al. The Common R71H-G230A-R293Q Human TMEM173 Is a Null Allele. J Immunol 198, 776-787 (2017).

      (2) Sebastian, M. et al. Obesity and STING1 genotype associate with 23-valent pneumococcal vaccination efficacy. JCI Insight 5 (2020).

      (3) Mansouri, S. et al. MPYS Modulates Fatty Acid Metabolism and Immune Tolerance at Homeostasis Independent of Type I IFNs. J Immunol 209, 2114-2132 (2022).

      (4) Sivick, K. E. et al. Comment on "The Common R71H-G230A-R293Q Human TMEM173 Is a Null Allele". J Immunol 198, 4183-4185 (2017).

      (5) Gulen, M. F. et al. Signalling strength determines proapoptotic functions of STING. Nat Commun 8, 427 (2017).

      (6) Kabelitz, D. et al. Signal strength of STING activation determines cytokine plasticity and cell death in human monocytes. Sci Rep 12, 17827 (2022).

      (7) Murthy, A. M. V., Robinson, N. & Kumar, S. Crosstalk between cGAS-STING signaling and cell death. Cell Death Differ 27, 2989-3003 (2020).

      (8) Kuhl, N. et al. STING agonism turns human T cells into interferon-producing cells but impedes their functionality. EMBO Rep 24, e55536 (2023).

      (9) Li, C., Liu, J., Hou, W., Kang, R. & Tang, D. STING1 Promotes Ferroptosis Through MFN1/2-Dependent Mitochondrial Fusion. Front Cell Dev Biol 9, 698679 (2021).

      (10) Song, C. et al. SHR1032, a novel STING agonist, stimulates anti-tumor immunity and directly induces AML apoptosis. Sci Rep 12, 8579 (2022).

      (11) Jin, L. et al. Identification and characterization of a loss-of-function human MPYS variant. Genes Immun 12, 263-269 (2011).

      (12) Yi, G. et al. Single nucleotide polymorphisms of human STING can affect innate immune response to cyclic dinucleotides. PLoS One 8, e77846 (2013).

      (13) Patel, S. et al. Response to Comment on "The Common R71H-G230A-R293Q Human TMEM173 Is a Null Allele". J Immunol 198, 4185-4188 (2017).

      (14) Gao, K. M. et al. Endothelial cell expression of a STING gain-of-function mutation initiates pulmonary lymphocytic infiltration. Cell Rep 43, 114114 (2024).

      (15) Gao, K. M., Motwani, M., Tedder, T., Marshak-Rothstein, A. & Fitzgerald, K. A. Radioresistant cells initiate lymphocyte-dependent lung inflammation and IFNgamma-dependent mortality in STING gain-of-function mice. Proc Natl Acad Sci U S A 119, e2202327119 (2022).

      (16) Hu, W. et al. Regulatory T cells function in established systemic inflammation and reverse fatal autoimmunity. Nat Immunol 22, 1163-1174 (2021).

      (17) Monroe, K. M. et al. IFI16 DNA sensor is required for death of lymphoid CD4 T cells abortively infected with HIV. Science 343, 428-432 (2014).

      (18) Doitsh, G. et al. Cell death by pyroptosis drives CD4 T-cell depletion in HIV-1 infection. Nature 505, 509-514 (2014).

      (19) Jakobsen, M. R., Olagnier, D. & Hiscott, J. Innate immune sensing of HIV-1 infection. Curr Opin HIV AIDS 10, 96-102 (2015).

      (20) Silvin, A. & Manel, N. Innate immune sensing of HIV infection. Curr Opin Immunol 32, 54-60 (2015).

      (21) Altfeld, M. & Gale, M., Jr. Innate immunity against HIV-1 infection. Nat Immunol 16, 554-562 (2015).

      (22) Krapp, C., Jonsson, K. & Jakobsen, M. R. STING dependent sensing - Does HIV actually care? Cytokine Growth Factor Rev 40, 68-76 (2018).

      (23) Liu, Y. et al. Activated STING in a vascular and pulmonary syndrome. N Engl J Med 371, 507-518 (2014).

      (24) Luksch, H. et al. STING-associated lung disease in mice relies on T cells but not type I interferon. J Allergy Clin Immunol 144, 254-266 e258 (2019).

      (25) Stinson, W. A. et al. The IFN-gamma receptor promotes immune dysregulation and disease in STING gain-of-function mice. JCI Insight 7 (2022).

      (26) Warner, J. D. et al. STING-associated vasculopathy develops independently of IRF3 in mice. J Exp Med 214, 3279-3292 (2017).

      (27) Fremond, M. L. et al. Overview of STING-Associated Vasculopathy with Onset in Infancy (SAVI) Among 21 Patients. J Allergy Clin Immunol Pract 9, 803-818 e811 (2021).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #3 (Public Review):

      The iron manipulation experiments are in the whole animal and it is likely that this affects general feeding behaviour, which is known to affect NB exit from quiescence and proliferative capacity. The loss of ferritin in the gut and iron chelators enhancing the NB phenotype are used as evidence that glia provide iron to NB to support their number and proliferation. Since the loss of NB is a phenotype that could result from many possible underlying causes (including low nutrition), this specific conclusion is one of many possibilities.

      We investigated the feeding behavior of flies using Brilliant Blue (Sigma, 861146)[1]. Our results showed that the amount of dye in the fly body was similar between the control group and the BPS group, suggesting that BPS had almost no effect on feeding behavior (Figure 3—figure supplement 1A).

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      There was a gap between the Pros nuclear localization and downstream targets of ferritin, particularly NADH dehydrogenase and biosynthesis. Could overexpression of Ndi1 restore Pros localization in NBs?

      Ferritin defects reduce iron levels, which leads to cell cycle arrest of NBs via ATP shortage, and cell cycle arrest of NBs probably results in NB differentiation[2, 3]. We have added the experiment in Figure 5—figure supplement 2. This result showed that overexpression of Ndi1 could significantly restore Pros localization in NBs.

      The abstract requires revision to cover the major findings of the manuscript, particularly the second half.

      We revised the abstract to add more major findings of the manuscript in the second half as follows:

      “Abstract

      Stem cell niche is critical for regulating the behavior of stem cells. Drosophila neural stem cells (Neuroblasts, NBs) are encased by glial niche cells closely, but it still remains unclear whether glial niche cells can regulate the self-renewal and differentiation of NBs. Here we show that ferritin produced by glia, cooperates with Zip13 to transport iron into NBs for the energy production, which is essential to the self-renewal and proliferation of NBs. The knockdown of glial ferritin encoding genes causes energy shortage in NBs via downregulating aconitase activity and NAD+ level, which leads to the low proliferation and premature differentiation of NBs mediated by Prospero entering nuclei. More importantly, ferritin is a potential target for tumor suppression. In addition, the level of glial ferritin production is affected by the status of NBs, establishing a bicellular iron homeostasis. In this study, we demonstrate that glial cells are indispensable to maintain the self-renewal of NBs, unveiling a novel role of the NB glial niche during brain development.”

      In Figure 2B Mira appeared to be nuclear in NBs, which is inconsistent with its normal localization. Was it Dpn by mistake?

      In Figure 2B, we confirmed that it is Mira. Moreover, we also provide a magnified picture in Figure 2B’, showing that Mira mainly localizes to the cortex or the cytoplasm, as previously reported.

      Figure 2C, Fer1HCH-GFP/mCherry localization was non-uniform in the NBs revealing 1-2 regions devoid of protein localization potentially corresponding to the nucleus and Mira crescent enrichment. It is important to co-label the nucleus in these cells and discuss the intracellular localization pattern of Ferritin.

      We have revised the picture to include the nuclear marker DAPI in Figure 2C. The result showed that Fer1HCH-GFP/Fer2LCH-mCherry was not co-localized with DAPI, which indicated that Drosophila ferritin is predominantly distributed in the cytosol[4, 5]. As for the concern mentioned by this reviewer, the GFP/mCherry signal in NBs came from ferritin overexpressed in glia, which probably resulted in the non-uniform signal.

      In Figure 3-figure supplement 3F, glial cells in Fer1HCH RNAi appeared to be smaller in size. This should be quantified. Given the significance of ferritin in cortex glial cells, examining the morphology of cortex glial cells is essential.

      In Figure 3—figure supplement 3F, we did not label single glial cells, so it was difficult to determine whether the size was changed. However, it seems that the chamber formed by the cellular processes of glial cells becomes smaller in Fer1HCH RNAi. The glial chamber undergoes remodeling during neurogenesis, responding to NB signals to enclose the NB and its progeny[6]. Thus, the size of the glial chamber is regulated by NB lineage size. In our study, the ferritin defect leads to low proliferation, resulting in a smaller lineage for each NB, which likely makes the chamber smaller.

      Since the authors showed that the reduced NB number was not due to apoptosis, a time-course experiment for glial ferritin KD is recommended to identify the earliest stage when the phenotype in NB number /proliferation manifests during larval brain development.

      We observed brains at different larval stages upon glial ferritin KD. The result showed that NB proliferation decreased significantly, while NB number declined only slightly, at the second-instar larval stage (Figure 5—figure supplement 1E and F), suggesting that the brain defect of glial ferritin KD first manifests at the second-instar larval stage.

      Transcriptome analysis on ferritin glial KD identified genes in mitochondrial functions, while the in vivo EM data suggested no defects in mitochondria morphology. A short discussion on the inconsistency is required.

      For the observation of mitochondrial morphology in the in vivo EM data, we focused on visible cristae in mitochondria, which are used to determine whether ferroptosis occurs[7]. It is possible that other details of mitochondrial morphology were changed, but we did not focus on them. To describe this result more accurately, we replaced “However, our observation revealed no discernible defects in the mitochondria of NBs after glial ferritin knockdown” with “However, our result showed that the mitochondrial double membrane and cristae were clearly visible whether in the control group or glial ferritin knockdown group, which suggested that ferroptosis was not the main cause of NB loss upon glial ferritin knockdown” in lines 207-209.

      The statement “we found no obvious defects of brain at the first-instar larval stage (0-4 hours after larval hatching) when knocking down glial ferritin (Figure 5-figure supplement 1C).” lacks quantification of NB number and proliferation, making it challenging to conclude.

      We have provided the quantification of NB number and proliferation rate at the first-instar larval stage in Figure 5—figure supplement 1C and D. The data showed that there is no significant change in NB number or proliferation rate when knocking down ferritin, suggesting that no brain defect manifests at the first-instar larval stage.

      A wild-type control is necessary for Figure 6A-C as a reference for normal brain sizes.

      We have added Insc>mCherry RNAi as a reference in Figure 6A-D, which showed that the brain size of the tumor model is larger than that of the normal brain. Moreover, we moved the brat RNAi data from Figure 6A-D to Figure 6—figure supplement 1A-D for a better layout.

      In Figures 6B, D, “Tumor size” should be corrected to “Larval brain volume”.

      Here, we measured the brain area via ImageJ to assess the severity of the tumor, rather than using 3D data of the brain volume. We therefore think it is more appropriate to use “Larval brain size” than “Larval brain volume” here. Thus, we have corrected “Tumor size” to “Larval brain size” in Figure 6B and D and in Figure 6—figure supplement 1B and D.

      Considering that asymmetric division defects in NBs may lead to premature differentiation, it is advisable to explore the potential involvement of ferritin in asymmetric division.

      aPKC is a classic marker for determining asymmetric division defects in NBs. We performed aPKC staining and found that it displayed a crescent at the apical cortex, based on the daughter cell position, in both control and glial ferritin knockdown (Figure 5—figure supplement 3A). This result indicated that there was no obvious asymmetric division defect after glial ferritin knockdown.

      In the statement "Secondly, we examined the apoptosis in glial cells via Caspase-3 or TUNEL staining, and found the apoptotic signal remained unchanged after glial ferritin knockdown (Figure 3-figure supplement 3A-D).", replace "the apoptosis in glial cells" with "the apoptosis in larval brain cells".

      We have replaced "the apoptosis in glial cells" with "the apoptosis in larval brain cells" in line 216.

      Include a discussion on the involvement of ferritin in mammalian brain development and address the limitations associated with considering ferritin as a potential target for tumor suppression.

      We have added the discussion about ferritin in mammalian brain development in lines 428-430 and the limitations of ferritin as a target for tumor suppression in lines 441-444.

      Indicate Insc-GAL4 as BDSC#8751, even if obtained from another source. Additionally, provide information on the extensively used DsRed fly stock used in this study within the methods section.

      We provided the stock information of Insc-GAL4 and DsRed in lines 673-674.

      Reviewer #2 (Recommendations For The Authors):

      Major points:

      The number of NBs differs a lot between experiments. For example, in Fig 1B and 1K controls present less than 100 NBs whereas in Figure 1 Supplementary 2B it can be seen that controls have more than 150. Then, depending on which control you compare the number of NBs in flies silencing Fer1HCH or Fer2LCH, the results might change. The authors should explain this.

      Figure 1 Supplementary 2B (Figure 1 Supplementary 3B in the revised version) shows the NB number in the VNC region, while Fig 1B and 1K show the NB number in the CB region. At first, we described the general phenotype, showing the NB number in CB and VNC respectively (Fig 1 and Fig 1-Supplementary 1 and 3 in the revised version), and the NB number is consistent within each region. After that, we focused on the NB number in CB for convenience.

      This reviewer encourages the authors to use better Gal4 lines to describe the expression patterns of ferritins and Zip13 in the developing brain. On the one hand, the authors do not state which lines they are using (including supplementary table). On the other hand, new Trojan GAL4 (or at least InSite GAL4) lines are a much better tool than classic enhancer trap lines. The authors should perform this experiment.

      All stock sources and numbers were documented in Table 2. The ferritin GAL4 and Zip13 GAL4 lines in this study are InSite GAL4. In addition, we also used another Fer2LCH enhancer trap GAL4 (DGRC104255) to verify our result and provided the result in Figure 2—figure supplement 1. Our data showed that DsRed driven by Fer2LCH-GAL4 was co-localized with the glial nuclear protein Repo, instead of the NB nuclear protein Dpn, which was consistent with the result of Fer1HCH/Fer2LCH GAL4. In addition, we will try to obtain the Trojan GAL4 lines (Fer1HCH/Fer2LCH GAL4 and Zip13 GAL4) and validate this result in the future.

      The authors exclude very rapidly the possibility of ferroptosis based only on some mitochondrial morphological features without analysing the other hallmarks of this iron-driven cell death. The authors should at least measure lipid peroxidation levels in their experimental scenario, either with a kit to quantify by-products of lipid peroxidation such as malondialdehyde (MDA) or using an anti-4-HNE antibody.

      We combined multiple experiments to exclude the possibility of ferroptosis. Firstly, ferroptosis can be terminated by an iron chelator. We fed flies an iron chelator upon glial ferritin knockdown, but NB number and proliferation were not restored, which suggested that ferroptosis was probably not the cause of NB loss induced by glial ferritin knockdown (Figure 3B and C). Secondly, Zip13 transports iron into the secretory pathway and further out of the cells in the Drosophila gut[8]. Our data showed that knocking down the iron transporter Zip13 in glia resulted in a decline in NB number and proliferation, which was consistent with the phenotype upon glial ferritin knockdown (Figure 3E-G). More importantly, the simultaneous knockdown of Zip13 and ferritin aggravated the phenotype in NB number and proliferation (Figure 3E-G). These results suggested that the phenotype was induced by iron deficiency in NBs, which excluded the possibility that iron overload or ferroptosis was the main cause of NB loss upon glial ferritin knockdown. Finally, we examined the mitochondrial double membrane and cristae, whose morphology is a critical hallmark of ferroptosis, but found no significant damage (Figure 3-figure supplement 2E and F).

      In addition, we have added the 4-HNE determination in Figure 3—figure supplement 2G and H. This result showed that the 4-HNE level did not change significantly, suggesting that lipid peroxidation was stable, which further argues against ferroptosis as the cause of NB loss upon glial ferritin knockdown.

      All of the above results together indicate that ferroptosis is not the cause of NB loss after ferritin knockdown.

      A major flaw of the manuscript is related to the chapter Glial ferritin defects result in impaired Fe-S cluster activity and ATP production and the results displayed in Figure 4. The authors talk about the importance of FeS clusters for energy production in the mitochondria. Surprisingly, the authors do not analyse the genes involved in this process such as but they present the interaction with the cytosolic FeS machinery that has a role in some extramitochondrial proteins but no role in the synthesis of FeS clusters incorporated in the enzymes of the TCA cycle and the respiratory chain. The authors should repeat the experiments incorporating the genes NSF1 (CG12264), ISCU(CG9836), ISD11 (CG3717), and fh (CG8971) or remove (or at least rewrite) this entire section.

      Thank you for this constructive advice; we have revised Figure 4B and C accordingly. We repeated the experiment, blocking mitochondrial Fe-S cluster biosynthesis by knocking down Nfs1 (CG12264), ISCU (CG9836), ISD11 (CG3717), and fh (CG8971), respectively. Nfs1 knockdown in NBs led to low proliferation, which was consistent with CIA knockdown. However, we did not observe an obvious brain defect upon knockdown of ISCU (CG9836), ISD11 (CG3717), or fh (CG8971) in NBs. Our interpretation of these results is that Nfs1 is probably a necessary core component of Fe-S cluster assembly, while the others are dispensable[9].

      The presence and aim of the mouse model is unclear to this reviewer. On the one hand, it is not used to corroborate the fly findings regarding iron needs from neuroblasts. On the other hand, and without further explanation, authors migrate from a fly tumor model based on modifying all neuroblasts to a mammalian model based exclusively on a glioma. The authors should clarify those issues.

      Although iron transporters probably differ between Drosophila and mammals, the function of iron as an essential nutrient for cell growth and proliferation is conserved from Drosophila to mammals. The fly data suggested that iron is critical for brain tumor growth, and thus we verified this in a mammalian model. Glioma is the most common form of central nervous system neoplasm that originates from neuroglial stem or progenitor cells[10]. Therefore, we tested the effect of the iron chelator DFP on glioma in mice and found that DFP could suppress glioma growth and prolong the survival of tumor-bearing mice.

      Minor points

      The authors did not include, either in the introduction or in the discussion, existing literature (although it refers to adult flies) on the expression of ferritins in glia or alterations of iron metabolism in fly glial cells (PMID: 21440626 and 25841783, respectively), or on the use of the iron chelator DFP in Drosophila (PMID: 23542074). The authors should check these papers and consider incorporating them into their manuscript.

      Thank you for the reminder. We have incorporated all recommended papers into our manuscript (lines 65-67 and 168).

      The number of experiments in each figure is missing.

      All experiments were repeated at least three times, and we have stated this in the Quantifications and Statistical Analysis section of Materials and methods.

      If graphs are expressed as mean +/- sem, it is difficult to understand the significance stated by the authors in Figure 2E.

      We apologize for this mistake and have revised this in Quantifications and Statistical Analysis. All statistical results were presented as means ± SD.

      When authors measure aconitase activity, are they measuring all (cytosolic and mitochondrial) or only one of them? This is important to better understand the experiments done by the authors to describe any mitochondrial contribution (see above in major points).

      In this experiment, we measured total aconitase activity. We also tried to measure mitochondrial aconitase activity, but the attempt failed, possibly because of the low biomass of the tissue sample.

      In this line, why do controls in aconitase and ATP lack an error bar? Are the statistical tests applied the correct ones? It is not the same to have paired or unpaired observations.

      This is due to normalization. We repeated these experiments at least three times, in different weeks, because the whole process (brain collection, protein determination, and ATP or aconitase determination) was time-consuming and labor-intensive. In addition, the efficiency of the aconitase and ATP kits changed over time, and we could not keep the experimental conditions identical across batches. Therefore, we normalized within each batch to obtain a more accurate result: the control group was set to 1 by dividing it by itself, and the other groups were divided by the control. This normalization was repeated for each of the three replicates, so there is no error bar for the control group. We think it is appropriate to apply ANOVA with a Bonferroni test to the three groups.
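
      The per-batch normalization described above can be sketched as follows. This is a minimal illustration only: the group names and all numbers are hypothetical and are not data from the study. It shows why the control group ends up at exactly 1 with zero spread, and hence has no error bar.

```python
from statistics import mean, stdev

# Hypothetical raw readouts (arbitrary units) from three independent batches.
# All values are invented for illustration; they are NOT data from the study.
batches = [
    {"control": 10.0, "ferritin_KD": 6.0, "rescue": 9.0},
    {"control": 20.0, "ferritin_KD": 13.0, "rescue": 19.0},
    {"control": 5.0, "ferritin_KD": 2.8, "rescue": 4.6},
]

def normalize_to_control(batch, control_key="control"):
    """Divide every group in one batch by that batch's control value."""
    ref = batch[control_key]
    return {group: value / ref for group, value in batch.items()}

# Normalize each batch separately, then summarize across batches.
normalized = [normalize_to_control(b) for b in batches]

def summarize(group):
    """Return (mean, SD) of the normalized values of one group."""
    values = [n[group] for n in normalized]
    return mean(values), stdev(values)

summary = {group: summarize(group) for group in batches[0]}

for group, (m, sd) in summary.items():
    print(f"{group}: mean={m:.3f}, SD={sd:.3f}")
```

      Because the control is divided by itself in every batch, its normalized values are identical and its SD is zero by construction, while the other groups retain batch-to-batch variation.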

      In some cases, further rescue experiments would be appreciated. For example, expression of Ndi restores control NAD+ levels or number of NBs, it would be interesting to know if this is accompanied by restoring mitochondrial integrity and its ability to produce ATP.

      We have determined ATP production after overexpressing Ndi1 and provided this result in Figure 4—figure supplement 1B. The data showed that expression of Ndi1 could restore ATP production upon glial Fer2LCH knockdown, which was consistent with our conclusion.

      Lines 293-299 on page 7 are difficult to understand.

      According to the results above, the decrease in NB number and proliferation upon glial ferritin knockdown (KD) was caused by energy deficiency. As shown in the schematic diagram (Author response image 1), “T” represents the total energy used for NB maintenance and proliferation, “N” the energy for maintaining NB number, and “P” the energy for NB proliferation, so that “T” equals “N” plus “P”. When ferritin was knocked down in glia, “T”, “N” and “P” all declined in “Ferritin KD” compared to “wildtype (WT)”. Knockdown of pros can prevent NB differentiation, but it cannot supply energy to NBs, which probably explains the rescue of NB number but not proliferation. Specifically, NB number increased significantly in “Ferritin KD Pros KD” compared to “Ferritin KD”, so more energy was consumed for NB maintenance in “Ferritin KD Pros KD”. As shown in the schematic diagram, “T” did not change between “Ferritin KD Pros KD” and “Ferritin KD”, whereas “N” increased in “Ferritin KD Pros KD” compared to “Ferritin KD”. Thus, “P” decreased: less energy remained for proliferation, leading to the failure to rescue NB proliferation. Indeed, the level of proliferation in “Ferritin KD Pros KD” seemed even lower than in “Ferritin KD”.

      Author response image 1.

      Schematic diagram of the relationship between energy and NB function in different groups. “T” represents total energy for NB maintenance and proliferation. “N” represents the energy for NB maintenance. “P” represents the energy for NB proliferation. T=N+P
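
      The energy budget T=N+P can be made concrete with a small worked example. The sketch below uses arbitrary integer units chosen purely for illustration (they are assumptions, not measurements) and only shows the arithmetic behind the argument: if T stays fixed while N rises, P must fall.

```python
def proliferation_energy(total, maintenance):
    """Energy left for proliferation under the budget T = N + P."""
    if maintenance > total:
        raise ValueError("maintenance energy cannot exceed total energy")
    return total - maintenance

# Hypothetical budgets in arbitrary units (illustrative assumptions only).
groups = {
    "WT": {"T": 10, "N": 4},
    "Ferritin KD": {"T": 6, "N": 2},           # T, N and P all decline vs WT
    "Ferritin KD Pros KD": {"T": 6, "N": 4},   # same T, more NBs to maintain
}

P = {name: proliferation_energy(g["T"], g["N"]) for name, g in groups.items()}

for name, value in P.items():
    print(f"{name}: P = {value}")
```

      With these illustrative numbers, P drops from 4 in “Ferritin KD” to 2 in “Ferritin KD Pros KD” even though T is unchanged, which mirrors the reasoning in the response.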

      Line 601 should indicate that Tables 2 and 3 are part of the supplementary material.

      We have revised this in line 678.

      Figure 4-supplement 1. Only validation of 2 genes from a RNAseq seems too little.

      We dissected hundreds of brains to sort NBs because of the low biomass of the fly brain. This is difficult and labor-intensive work. Most NBs were used for RNA-seq, so only a small amount of sample was left for validation, which was not enough to test more genes.

      Figure 6E, the authors indicate that 10 mg/ml DFP injection could significantly prolong the survival time. Which increase in % is produced by DFP?

      We have provided the bar graph in Author response image 2. The increase is about 16.67% by DFP injection.

      Author response image 2.

      The bar graph of survival time of mice treated with DFP.

      (The unpaired two-sided Student’s t test was employed to assess statistical significance. Statistical results were presented as means ± SD. n=7,6; *: p<0.05)

      Reviewer #3 (Recommendations For The Authors):

      As I read the initial results that built the story (glia make ferritin>release it> NBs take them up>use it for TCA and ETC) I kept thinking about what it meant for NBs to be 'lost'. This led me to consider alternate possibilities that the results might point to, other than the ones the authors were suggesting. It was only in Figure 5 that the authors ruled out some of those possibilities. I would suggest that they first illustrate how NBs are lost upon glial ferritin loss of function before they delve into the mechanism. This would also be a place to similarly address that glial numbers and general morphology are unchanged upon ferritin loss.

      This recommendation provides a valuable guideline for building the story, especially for researchers interested in neural stem cell studies. We did try this logic to present our study, but found that it left several gaps in the middle of the manuscript, such as the relationship between glial ferritin and Pros localization in NBs, so that the story could not be presented fluently. Therefore, we decided to keep the current order of presentation.

      More details of the screen would be useful to know. How many lines did they screen, what was the assay? This is not mentioned anywhere in the text.

      We have added this to the Screen section of Materials and methods. We screened about 200 lines targeting components of classical signaling pathways, genes highly expressed in glial cells, or genes encoding secreted proteins. UAS-RNAi lines were crossed with repo-Gal4, and third-instar F1 larvae were dissected. The brains were immunostained for Dpn and PH3 and then imaged on a confocal microscope.

      Many graphs seem to be repeated in the main figures and the supplementary data. This is unnecessary, or at least should be mentioned.

      We appreciate your kind reminder. However, we carefully went through all the figures and did not find the repeated graphs, though some of them look similar.

      The authors mention that they tested which glial subtypes ferritin is needed in, but don't show the data. Could they please show the data? Same with the other iron transport/storage/regulation. Also, in both this and later sections, the authors could mention which Gal4 was used to label what cell types. The assumption is that the reader will know this information.

      We have added the results of ferritin knockdown in glial subpopulations in Figure 1-figure supplement 2. However, given the number of iron-related genes tested, we did not take images for them, but recorded the results in Table 3.

      For all their images showing colocalisation, magnified, single-colour images shown in grayscale will be useful. For example, without the magnification, it is not possible to see the NB expression of the protein trap line in Figure 2B. A magnified crop of a few NBs (not a single one like in 2C) would be more useful.

      We have provided Figure 2A’, B’, D’ and Figure 3D’ as suggested.

      There are a lot of very specific assays used to detect ROS, NAD, aconitase activity, among others. It would be nice to have a brief but clear description of how they work in the main text. I found myself having to refer to other sources to understand them. (I believe SoNAR should be attributed to Zhao et al 206 and not Bonnay et al 2020.)

      We have added brief descriptions of the ROS, aconitase activity, and NAD assays in lines 198-199, 229-231, and 269, as suggested.

      I did not understand the normalisation done with respect to SoNAR. Is this standard practice? Is the assumption that 'overall protein levels will be higher in slowly proliferating NBs' reasonable? This is why they state the need to normalise.

      SoNar normalization is not standard practice. However, we think that our normalization of SoNar is reasonable. According to our results, the expression levels of Dpn and Mira seemed higher upon glial ferritin knockdown, so we speculated that some proteins accumulate in slowly proliferating NBs. We therefore used Insc-GAL4 to drive DsRed as a readout of Insc expression and found that DsRed increased after glial ferritin knockdown, suggesting that Insc expression was indeed increased. Therefore, we had to normalize SoNar driven by Insc-GAL4 to DsRed driven by Insc-GAL4, which eliminates the effect of increased Insc expression upon glial ferritin knockdown.
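
      The ratio normalization described above can be sketched as follows. This is a minimal illustration under stated assumptions: each NB is assumed to have one SoNar intensity and one DsRed intensity measured under the same driver, and all intensity values are hypothetical rather than taken from the study.

```python
from statistics import mean

def normalize_by_reporter(sonar, dsred):
    """Divide each SoNar intensity by the DsRed intensity of the same cell."""
    if len(sonar) != len(dsred):
        raise ValueError("need one DsRed value per SoNar value")
    return [s / d for s, d in zip(sonar, dsred)]

# Hypothetical per-NB intensities (arbitrary units), invented for illustration.
control = {"sonar": [100.0, 120.0], "dsred": [50.0, 60.0]}
knockdown = {"sonar": [120.0, 150.0], "dsred": [80.0, 100.0]}

control_ratio = normalize_by_reporter(control["sonar"], control["dsred"])
knockdown_ratio = normalize_by_reporter(knockdown["sonar"], knockdown["dsred"])

print("raw SoNar means:", mean(control["sonar"]), mean(knockdown["sonar"]))
print("normalized means:", mean(control_ratio), mean(knockdown_ratio))
```

      In this invented example the raw SoNar signal is higher in the knockdown, but the normalized signal is lower once the increased driver activity reported by DsRed is accounted for. It is meant only to show why a driver-matched reference can change the comparison, not to reproduce the reported result.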

      FAC is mentioned as a chelator? But the authors seem to use it oppositely. Is there an error?

      FAC is an iron salt that is used to supply iron. We have indicated this in line 156, as you advised.

      The lack of any cell death in the L3 brain surprised me. There should be plenty of hemilineages that die, as do many NBs, particularly in the abdominal segments. Is the stain working? Related to this, P35 is not the best method for rescuing cell death. H99 might be a better way to go.

      We were also surprised by this result and repeated the experiment several times with both negative and positive controls. Moreover, we used TUNEL to validate this result and obtained the same outcome. We will try to use H99 to rescue NB loss in the future, because it first needs to be integrated and recombined with our current genetic tools.

      It would be nice to see the aconitase activity signal as opposed to just the quantification.

      This method only measures absorbance as a readout of aconitase activity, so we can only show the quantification.

      Glia are born after NBs are specified. In fact, they arise from NBs (and glioblasts). So, it's unlikely that the knockdown of ferritin in glia can at all affect initial NB specification.

      We completely agree with this statement.

      The section on tumor suppression seems out of place. The fly data on which the authors base this as an angle to chase is weak. Dividing cells will be impaired if they have inadequate energy production. As a therapeutic, this will affect every cell in the body. I'm not sure that cancer therapeutics is pursuing such broadly acting lines of therapies anymore.

      Our data suggested that iron/ferritin is more critical for highly proliferative cells. Tumor cells have a high expression of TfR (Transferrin Receptor)[11], which can bind to both Transferrin and ferritin[12], and ferritin specifically targets tumor cells[11]. Thus, we think iron/ferritin is essential for tumor cells. If we can find an appropriate dose of an iron/ferritin inhibitor that suppresses tumor growth while maintaining normal cell growth, iron/ferritin might be an effective target for tumor treatment.

      The feedback from NB to glial ferritin is also weak data. The increased cell numbers (of unknown identity) could well be contributing to the increase in ferritin. I would omit the last two sections from the MS.

      In brat RNAi and numb RNAi, increased cells are NB-like cells, which cannot undergo further differentiation and are not expected to produce ferritin. More importantly, we used Repo (glia marker) as the reference and quantified the ratio of ferritin level to Repo level, which can exclude the possibility that increased glial cells lead to the increase in ferritin.

      References

      (1) Tanimura T, Isono K, Takamura T, et al. Genetic Dimorphism in the Taste Sensitivity to Trehalose in Drosophila-Melanogaster. J Comp Physiol, 1982,147(4):433-7

      (2) Myster DL, Duronio RJ. Cell cycle: To differentiate or not to differentiate? Current Biology, 2000,10(8):R302-R4

      (3) Dalton S. Linking the Cell Cycle to Cell Fate Decisions. Trends in Cell Biology, 2015,25(10):592-600

      (4) Nichol H, Law JH, Winzerling JJ. Iron metabolism in insects. Annu Rev Entomol, 2002,47:535-59

      (5) Pham DQ, Winzerling JJ. Insect ferritins: Typical or atypical? Biochim Biophys Acta, 2010,1800(8):824-33

      (6) Speder P, Brand AH. Systemic and local cues drive neural stem cell niche remodelling during neurogenesis in Drosophila. Elife, 2018,7

      (7) Mumbauer S, Pascual J, Kolotuev I, et al. Ferritin heavy chain protects the developing wing from reactive oxygen species and ferroptosis. PLoS Genet, 2019,15(9):e1008396

      (8) Xiao G, Wan Z, Fan Q, et al. The metal transporter ZIP13 supplies iron into the secretory pathway in Drosophila melanogaster. Elife, 2014,3:e03191

      (9) Marelja Z, Leimkühler S, Missirlis F. Iron Sulfur and Molybdenum Cofactor Enzymes Regulate the Drosophila Life Cycle by Controlling Cell Metabolism. Front Physiol, 2018,9

      (10) Morgan LL. The epidemiology of glioma in adults: a "state of the science" review. Neuro-Oncology, 2015,17(4):623-4

      (11) Fan K, Cao C, Pan Y, et al. Magnetoferritin nanoparticles for targeting and visualizing tumour tissues. Nat Nanotechnol, 2012,7(7):459-64

      (12) Li L, Fang CJ, Ryan JC, et al. Binding and uptake of H-ferritin are mediated by human transferrin receptor-1. Proc Natl Acad Sci U S A, 2010,107(8):3505-10

    2. eLife assessment

      This valuable study, which seeks to identify factors from the glial niche that support and maintain neural stem cells, reports a novel role for ferritin in this process. The authors provide solid evidence that defects in larval brain development in Drosophila, resulting from ferritin knockdown, can be attributed to impaired Fe-S cluster activity and ATP production. The findings of this well-conducted study will be of interest to oncologists and neurobiologists.

    3. Reviewer #1 (Public Review):

      This study unveils a novel role for ferritin in Drosophila larval brain development. Furthermore, it pinpoints that the observed defects in larval brain development resulting from ferritin knockdown are attributed to impaired Fe-S cluster activity and ATP production. Overall this is a well-conducted and novel study.

      The authors have adequately addressed the concerns.

    4. Reviewer #2 (Public Review):

      Summary:

      Zhixin and collaborators have investigated if the molecular pathways present in glia play a role in the proliferation, maintenance and differentiation of Neural Stem Cells. In this case, Drosophila Neuroblasts are used as models. Authors find that neuronal iron metabolism modulated by glial ferritin is an essential element for Neuroblast proliferation and differentiation. They show that loss of glial ferritin is sufficient to impact the number of neuroblasts. Remarkably, authors have identified that ferritin produced in the glia is secreted to be used as an iron source by the neurons. Therefore iron defects in glia have serious consequences in neuroblasts and likely vice versa. Interestingly, preventing iron absorption in the intestine is sufficient to reduce NB number. Furthermore, they have identified Zip13 as another regulator of the process. Evidence presented strongly indicates that the loss of neuroblasts is due to premature differentiation rather than cell death.

      Strengths:

      - Comprehensive analysis of the impact of glial iron metabolism in neuroblast behaviour by genetic and drug-based approaches as well as using a second model (mouse) for some validations.

      - Using cutting edge methods such as RNAseq as well as very elegant and clean approaches such as RNAi-resistant lines or temperature-sensitive tools

      - Goes beyond the state of the art highlighting iron as a key element in neuroblast formation as well as a target in tumor treatments.

      Comments on latest version:

      The authors have successfully and convincingly addressed all comments from this reviewer. The modifications, changes and additions have increased the robustness of the results and clearly increased the readability of the manuscript.

      This reviewer also appreciates all the efforts and extra work conducted by the authors to finish in a reasonable time all the experiments suggested by all reviewers.

    1. eLife assessment

      This valuable study asks how Promyelocytic leukemia protein (PML) becomes associated with the nucleoli of cells (PML Nucleolar Associations, PNAs) upon various genotoxic stimuli. Using immunostaining analysis with induced DNA double-strand breaks (DSBs) in rDNA repeats, the authors provide solid evidence that PNAs are triggered mostly by the inhibition of topoisomerase and RNA polymerase I, which is augmented by homologous recombination but not by the non-homologous end joining double-strand break repair pathway. The findings have potential implications for a better understanding of how DNA damage in ribosomal DNA is repaired for genome stability. This paper is of interest to researchers in the fields of nuclear structure and DNA repair.

    2. Summary:

      This paper described the dynamics of the nuclear substructure called PML Nucleolar Association (PNA) in response to DNA damage on ribosomal DNA (rDNA) repeats. The authors showed that the PNA with rDNA repeats is induced by the inhibition of topoisomerases and RNA polymerase I and that the PNA formation is modulated by RAD51, thus homologous recombination. Artificially induced DNA double-strand breaks (DSBs) in rDNA repeats stimulate the formation of PNA with DSB markers. This DSB-triggered PNA formation is regulated by DSB repair pathways.

      Strengths:

      This paper illustrates a unique DNA damage-induced sub-nuclear structure containing the PML body, which is specifically associated with the nucleolus. Moreover, the dynamics of this PML Nucleolar Association (PNA) require topoisomerases and RNA polymerase I and are modulated by RAD51-mediated homologous recombination and non-homologous end-joining. This study provides a unique regulation of DSB repair at rDNA repeats associated with the unique membrane-less subnuclear structure.

      Weaknesses:

      Although the PNA formation on rDNA repeat is nicely shown by cytological analysis, the biological significance of PNA in DSB repair is not fully addressed.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      Summary: 

      This paper described the dynamics of the nuclear substructure called PML Nucleolar Association (PNA) in response to DNA damage on ribosomal DNA (rDNA) repeats. The authors showed that the PNA with rDNA repeats is induced by the inhibition of topoisomerases and RNA polymerase I and that the PNA formation is modulated by RAD51, thus homologous recombination. Artificially induced DNA double-strand breaks (DSBs) in rDNA repeats stimulate the formation of PNA with DSB markers. This DSB-triggered PNA formation is regulated by DSB repair pathways.

      Strengths: 

      This paper illustrates a unique DNA damage-induced sub-nuclear structure containing the PML body, which is specifically associated with the nucleolus. Moreover, the dynamics of this PML Nucleolar Association (PNA) require topoisomerases and RNA polymerase I and are modulated by RAD51-mediated homologous recombination and non-homologous end-joining. This study provides a unique regulation of DSB repair at rDNA repeats associated with the unique membrane-less subnuclear structure.

      Weaknesses: 

      Although the PNA formation on rDNA repeat is nicely shown by cytological analysis, the biological significance of PNA in DSB repair is not fully addressed.

      We appreciate the succinct summary, and thank you for pointing out this insightful comment. Our data show that the dynamic interaction of PML with nucleolar caps can recognize and sequester damaged rDNA from the reactivated nucleolus. We propose that through this process, the actively transcribed intact rDNA is protected from possible detrimental interaction with the defective, PNAs-sequestered rDNA, most likely to avoid the harmful intra- and inter-chromosomal recombination events that would otherwise likely occur during recombinational repair of the damaged rDNA, as the rDNA repeats present on five chromosomes are highly repetitive. Thus, this novel sorting mechanism might help sustain the integrity of repetitive rDNA loci.

      Our data also indicate that the emergence of PNAs coincided with cell cycle arrest and preceded the establishment of cellular senescence. The senescent response to rDNA damage can primarily protect the genome from the instability of rDNA loci in a manner broadly analogous to that described for protecting the telomeric loci. This notion is supported by the lack of PNA formation in most cancer cells. In the broader context of the biological significance of cellular senescence at the organismal level, such robust response to hazardous rDNA damage in the individual affected cells may limit/prevent the sporadic occurrence of early cancerous lesions, at the expense of potential tissue adverse effects accumulating over time and thereby eventually contributing to organismal aging.

      Reviewer #2 (Public Review): 

      In this manuscript, the authors aim to study the PML-nucleoli association (PNAs) by different genotoxic stress and to determine the underlying molecular mechanisms. 

      First, from a diverse set of genotoxic stress conditions (topoisomerases, RNA Pol I, rRNA processing, and DNA replication stress), the authors have found that the inhibition of topoisomerases and RNA Polymerase I has the highest PNA formation associated with p53 stabilization, gamma-H2AX, and PAF49 segregation. It was further demonstrated that Rad51-mediated HR pathway but not NHEJ pathway is associated with the PNA formation. Immuno-FISH assays show that doxorubicin induces DSBs (53BP1 foci) in rDNA and PNA interactions with rDNA/DJ regions. Furthermore, endonuclease I-PpoI induced DSB at a defined location in rDNA and led to PNAs.

      Most claims by the authors are supported by the data provided. However, below weaknesses/concerns may need to be addressed to improve the quality of the study. 

      (1) Top2B toxin doxorubicin had the highest degree of elevating PNAs; however, Top2B-knockdown had almost no noticeable effects on PNAs. How to reconcile the different phenotypes targeting Top2B? 

      We thank the reviewer for this comment and believe we can reconcile the results from doxorubicin treatments and the downregulation of TOP2A and B. 

      The different phenotypes can reflect the fact that doxorubicin targets both human TOP2 isoforms: TOP2A and TOP2B. Hence this treatment can limit any potential redundant roles of the individual topoisomerase subtypes, which, on the other hand, can be manifested under conditions when only one specific member is depleted genetically. On the other hand, it is also crucial to note that these isoforms are not fully functionally redundant. Each isoform reveals a characteristic expression pattern and distinct yet overlapping function (e.g. Nitiss J 2009, doi.org/10.1038/nrc2608, or Uusküla-Reimand 10.1126/sciadv.add4920). Thus, doxorubicin treatment or TOP2A KD can, contrary to TOP2B KD, trigger the formation of PNAs.   

      Additionally, besides topoisomerase inhibition and poisoning, doxorubicin intercalates DNA and elevates oxidative stress. Therefore, the observed effect of doxorubicin may also reflect, to some extent, its broader damaging impact on (r)DNA. On the other hand, the downregulation of individual topoisomerase isoforms shows how the restriction of their respective specific function/s may evoke (r)DNA damage.

      (2) To test the role of Rad51 and DNA-PKcs in the PNA formation, Rad51 inhibitor B02 and DNA-PKcs inhibitor NU-7441 were chosen to use in the study. To further exclude the possible off-target of B02 and NU-7441, siRNA-mediated knockdown of Rad51 and DNA-PKcs would be an appropriate complementary approach to the pharmaceutical inhibitor approach. 

      We followed this stimulating suggestion, and in the revised manuscript we used pools of siRNAs (esiRNA) to target the mRNA of RAD51 or ligase IV (LIG4), to mimic the RAD51 chemical inhibitor B02 and the NHEJ (DNA-PK) inhibitor NU-7441, respectively. The relevant new data are presented in Figure 5F-I, 6E, and F, Supplementary Figure 5D, E, F – H, and Supplementary Figure 6C-E. Notably, the results on rDNA damage-triggered PNAs formation obtained by chemical inhibition of the repair pathways and by the genetic approach (knockdown) were largely consistent, thereby supporting our original conclusions. There was one interesting partial difference between the B02 RAD51 inhibitor and RAD51 knockdown, which we comment on below. We suggest a plausible explanation reflecting the fact (known for other DDR proteins such as PARP1) that functional inhibition of an expressed protein (here RAD51, by B02) may not necessarily recapitulate phenotypically the absence of that protein (here RAD51 knockdown). Overall, we agree that this was a very important set of control experiments, which we additionally extended to cell cycle phase analysis.

      First, the LIG4 knockdown impacted the I-PpoI-induced PNAs formation in a way that followed the same trend as the effects caused by the NHEJ pathway inhibitor NU-7441, namely an increased frequency of PNAs formation when NHEJ was impaired (Figure 5E and 5I). This was expected based on what we know about PNA formation: the NHEJ pathway is active throughout the cell cycle, and when this repair mode is not available in the nucleolus, more rDNA breaks remain unrepaired and must be transported to the nucleolar caps to be processed by the HR pathway, thereby also leading to more PNA structures under such conditions. In terms of cell cycle phases, the observed increase of I-PpoI-induced PNAs in cells with depleted LIG4 was more pronounced in S/G2 cells, when the PNAs-promoting, cap-associated HR pathway is more active. Furthermore, the enhanced occurrence of I-PpoI-induced PNAs in cells depleted of LIG4 was counteracted (partly ‘rescued/prevented’) by concomitant treatment with the RAD51 inhibitor B02 (Figure 5E and I; compare cells with esiLIG4 alone versus esiLIG4 + B02), overall consistent with the notion that the cap-associated HR pathway facilitates PNAs formation.

      Second, in the analogous scenario of comparing the impact of the RAD51 chemical inhibitor (B02) with the siRNA-mediated knockdown of RAD51, the observed trends in the resulting frequencies of I-PpoI-induced PNAs were also largely consistent, in that both strategies of interfering with RAD51 resulted in fewer PNAs than in cells deficient in NHEJ. On the other hand, we must stress that after RAD51 knockdown we did not observe the decline of PNAs relative to control cells that was detected after B02 treatment (Figure 5E and I). However, when specifically considering the cell cycle position of the individual cells, these new analyses again revealed important similarities between the knockdown and chemical inhibition of RAD51 (Figure 6E, Supplementary Figure 6E).

      Before discussing the partial, cell-cycle-related difference between the impact of RAD51 chemical inhibition vs. knockdown, it is important to consider the PNAs patterns seen in cells with activated I-PpoI and proficient in both NHEJ and HR. The overall frequency of I-PpoI-induced PNAs formation was higher in G1 than in S/G2 cells. Considering that persistent rDNA DSBs trigger the formation of PNAs, this result may reflect the very limited HDR during G1 phase, in contrast to the more efficient repair of I-PpoI-induced rDNA DSBs in S/G2, the cell cycle phase in which NHEJ and HDR operate in parallel, the latter pathway offering a safer, error-free mechanism of DSB repair.

      Notably, when comparing the PNAs formation frequency in cells treated with either chemical inhibition of RAD51 (with B02) or upon knockdown of RAD51, we strikingly observed that the decrease of I-PpoI-induced PNAs formation upon RAD51 knockdown was apparent only for cells in G1 (Figure 6E, and Supplementary Figure 6E). We believe that the distinct impact of RAD51 knockdown compared with that of RAD51 inhibitor (mainly seen when S/G2 cells were analyzed separately) might reflect one or a combination of several factors, including e.g. the following:

      i) The knockdown-induced absence of RAD51 protein may allow access to the persistent DSB lesions by other alternative repair proteins (such as the RAD52-mediated repair reported in diverse pathophysiological circumstances including in cells undergoing senescence, a scenario very relevant for our present study). Such altered stoichiometry of proteins interacting with the persistent rDNA DSBs may contribute to the pattern of PNAs formation that is then distinct from the pattern seen in the presence of RAD51;

      ii) Another difference that we observe is the somewhat enhanced frequency of ‘spontaneous’ (i.e., even without activating the I-PpoI) PNAs formation when RAD51 is depleted, a phenomenon not seen when control non-targeting siRNA is transfected or when RAD51 is acutely inhibited by B02 (Figure 5H). Such spontaneous baseline PNA formation likely reflects the enhanced persistence of unrepaired endogenously occurring DNA lesions that are already suboptimally processed during the period following the esiRNA transfection, i.e., under stepwise depletion of the RAD51 protein which is normally required to deal with such omnipresent endogenous lesions occurring during e.g. DNA replication or some oxidative/metabolic processes; 

      iii) The knockdown approach, while robustly depleting RAD51 protein levels (see Supplementary Figure 5D), may nevertheless leave a small residual fraction of the RAD51 protein present in the cells, thereby possibly inhibiting the HDR pathway to a slightly lesser degree than the B02 inhibitor;

      iv) Additionally, it should be noted that the baseline levels of I-PpoI-induced PNA formation are somewhat higher in the transfection experiments (i.e., when using any siRNA, even the non-targeting control siRNA), compared with the less ‘invasive’ experiments of simply adding a drug/solvent to the cell culture medium. This phenomenon adds to the commonly seen (over decades, by us and many others) above-baseline transient stress in cells exposed to transfections, often causing even a moderate transient DNA damage response. Specifically, in control experiments, the level of I-PpoI-induced PNAs was around 15% in cells transfected with non-targeting siRNA, while in the comparable experiment of only I-PpoI induction under non-transfection conditions it was around 10%. In other words, the somewhat enhanced counts of I-PpoI-induced PNAs seen in the knockdown experiments compared with the chemical inhibitor experiments partly reflect a shift of the total readout due to the different baseline counts. This, however, does not alter the observed overall trends, which are consistent in both types of experiments.

      While the potential interpretation(s) of the above results are presented in the Discussion section of the revised manuscript, the full mechanistic elucidation of the impact of various experimental manipulations on the PNA formation during the cell cycle would require a dedicated follow-up study.

      (3) Several previous studies have shown the activation of the nucleolar ATM-mediated DNA damage response pathway by I-PpoI-induced DSBs in rDNA. What is the role of nucleolar ATM in the regulation of PNAs?

      We agree that this is an important issue, the resolution of which (explained below) strengthens the mechanistic insights provided in our revised manuscript, and we are grateful to the reviewer for raising this question. To address this point and extend the scope from ATM to ATR, we employed two small-molecule inhibitors of ATM (KU-60019 and KU-55933) and one inhibitor of ATR (VE-822), at concentrations commonly used in analogous studies in the DNA damage response field, to examine their impact on rDNA damage/PNA formation induced by I-PpoI. The new data are shown in Figures 5A and B. We found that inhibition of either of the two kinases alone robustly reduced the number of nuclei with PNAs, indicating that the activity of each of these two DNA damage signaling kinases is required for the formation of I-PpoI-induced PNAs in response to rDNA damage. Future experiments should elucidate precisely which of the very wide range of ATM/ATR substrates and/or specific protein domains and amino acid residues are instrumental in this rDNA damage signaling pathway to induce the formation of PNAs.

      Reviewer #3 (Public Review): 

      Summary: 

      Hornofova et al. examined interactions between the nucleolus and promyelocytic leukemia nuclear bodies (PML-NBs) termed PML-nucleolar associations (PNAs). PNAs are found in a minor subset of cells, exist within distinct morphological subcategories, and are induced by cellular stressors including genotoxic damage. A systematic pharmacological investigation identified that compounds that inhibit RNA Polymerase 1 (RNAPI) and/or topoisomerase 1 or 2A caused the greatest proportion of cells with PNA. A specific RAD51 inhibitor (B02) impacted the number of cells exhibiting PNAs and PNA morphology. Genetic double-strand break (DSB) induction within the rDNA locus also induced PNA structures that were more prevalent when non-homologous end joining (NHEJ) was inhibited.

      Strengths: 

      PNA are morphologically distinct and readily visualized. The imaging data are high quality, and rDNA is amenable to studying nuclear dynamics. Specific induction of rDNA damage is a strong addition to the non-specific pharmacological damage characterized early in the manuscript. These data nicely demonstrate that rDNA double-strand breaks undermine PNA formation. Figure 1 is a comprehensive examination and presents a compelling argument that RNAPI and/or TOP1, TOP2A inhibition promote PNA structures. 

      Weaknesses: 

      (1) The data are limited to fixed fluorescent microscopy of structures present in a minority of cells. Data are occasionally qualitative and/or based upon interpretation of dynamic events extrapolated from fixed imaging. This study would benefit from live imaging that captures PNA dynamics. 

      We fully agree with the reviewer that live-cell imaging is critical to adequately capture PNA formation and evolution dynamics. While the data presented in this manuscript are based on quantifications of fixed-cell images, all these analyses build on a detailed live-cell imaging examination of the dynamic behavior of PNAs that we reported in our original study on PNA formation as a biological phenomenon (Imrichova et al., doi: 10.18632/aging.102248; Epub 2019 Sep 7).

      In the revised version of our present manuscript, we better highlight the live-cell imaging study in the Introduction section and further point out that the previous dynamic study was based on imaging of human cells ectopically expressing PML-EGFP and B23-RFP. Last but not least, to help readers of this manuscript understand the dynamics of PNA evolution, we have now also added an improved schematic figure that better illustrates the temporal dynamics of PNA stage transitions (Figure 1A).

      (2) Cell cycle and cell division are not considered. Double-strand break repair is cell cycle dependent, and most experiments occur over days of treatment and recovery. It is unclear if the cultures are proliferating, or which cell cycle phase the cells are in at the time of analysis. It is also unclear if PNAs are repeatedly dissociating and reforming each cell division. 

      We agree that this is an important point. We previously published (Imrichova et al., doi: 10.18632/aging.102248) that exposure of RPE-1hTERT cells to doxorubicin caused cell cycle arrest and cellular senescence. In the revised manuscript, we added the analysis of how the I-PpoI-induced rDNA DSB affects the cell’s fate (Supplementary Figure 4J-N). Importantly, we found that most of the cells after I-PpoI-induced rDNA DSB also developed cellular senescence, and only 1–3% of cells eventually recovered from such rDNA stress to the extent that they were able to form colonies in a colony-forming assay. Thus, at the time of analysis, most of the cells were non-proliferating. 

      Additionally, in the revised manuscript, we included an analysis of the dependence of PNA formation on specific cell cycle phases (see Figures 6E–I and Supplementary Figure 6C–E). Generally, we found that PNAs can be present in G1/S/G2. Nevertheless, the probability of occurrence in a particular cell cycle phase is affected by the type of treatment. For example, after I-PpoI-induced rDNA damage, the PNAs are primarily present in G1. In contrast, after the sole knockdown of RAD51 or TOP2A, the PNAs are present in S/G2 with higher probability. 

      (3) The relationship of PNA morphologies (bowl, funnel, balloon, and PML-NDS) also remains unclear. It is possible that PNAs mature/progress through the distinct morphologies, and that morphological presentation is a readout of repair or damage in the rDNA locus. However, this is not formally addressed.  

      The reviewer is indeed correct in his/her interpretation of the PNA morphologies as a readout of the dynamic fate of the rDNA lesion. As mentioned in our response to the previous point no. 2 raised by this reviewer (see above), we described the dynamic structural PNA transitions in our previous article (Imrichova et al., doi: 10.18632/aging.102248).

      PNAs progress through distinct structures. Our results indicate that individual PNA subtypes are tied to specific processes. The PNA bowl-type is linked to the recognition of rDNA damage on the nucleolar periphery. The PNA funnel-type clusters several damaged rDNA loci from the nucleolus into PML-NDS, which is the ultimate structure that sequesters unrepaired rDNA away from the reactivated nucleolus.

      The formation of bowls, funnels, and balloons is linked to the inhibition of RNA polymerase I during the formation of nucleolar caps. In contrast, the later stage of PML-NDS is linked to RNA polymerase I reactivation. 

      We should mention that after the I-PpoI treatment, the ‘bowls’ and ‘funnels’ (observed originally in response to topoisomerase inhibitory drugs) are missing, and only PML-NDSs are formed. The apparent absence of the preceding stages of PNAs may reflect the lower extent of rDNA damage induced by I-PpoI treatment, without causing the pan-nucleolar RNA polymerase I inhibition that was observed for other treatments, such as doxorubicin.  

      (4) An I-PpoI targeted sequence within the rDNA locus suggests 3D structural rearrangement following damage. An orthogonal approach measuring rDNA 3D architecture would benefit comprehension.

      This is a very inspiring idea. Given the demanding nature of the required 3D analyses and the fact that this aspect is somewhat outside the scope of the present study, we plan to follow up on this issue in our future work, along with our efforts to localize the individual NORs using immuno-FISH after introducing the rDNA damage by I-PpoI.

      (5) Following I-PpoI induction, it is possible that cells arrest in a G1 state. This may explain why targeting NHEJ has a greater impact on the number of 53BP1 foci and should be investigated.

      We fully agree with the Reviewer. Indeed, our results showed that after a 24-hour period of I-PpoI induction, most cells (about 90%) are in the G1 phase of the cell cycle, consistent with the activation of the ATM/ATR checkpoint signaling and p53 activation that we observed. Therefore, this cell cycle effect can indeed explain why targeting NHEJ has a greater impact and causes the higher numbers of 53BP1 foci (and also γH2AX foci).

      (6) Conclusions: PNAs are a phenomenon of biological significance and understanding that significance is of value. More work is required to advance knowledge in this area. The authors may wish to examine the literature on APBs (ALT-associated PML-NBs), which are similar structures where telomeres associate with PML-NBs in a specific subset of cancers. It is possible that APBs and PNAs share similar biology, and prior efforts on APBs may help guide future PNA studies.

      We are very grateful for this stimulating suggestion. In the Discussion of the revised manuscript, we now address the possible analogy between the APBs under ALT on the one hand, and the PNA formation on rDNA damage studied here, on the other. The following is the quote of the relevant paragraph of the revised Discussion: 

      “There are several similarities between PNAs and APBs. The interaction partner of PML located on both the telomeres and rDNA must be sumoylated, as the PML-SIM domain is essential for the formation of both APBs and PNAs (37,93). The PML IV isoform most efficiently forms APBs and also PNAs (16,37). PML clusters damaged telomeres into APBs, and we observe that several NORs converge in one PNA structure; thus, the PML-dependent clustering of damaged NORs is plausible. On the other hand, there is one critical difference between the otherwise broadly analogous APBs and PNAs. The process of ALT operates in transformed cancer cells that do not express the telomerase, thus enabling telomere maintenance, cell proliferation, and immortalization (94,95). The PNAs, on the other hand, were primarily detected in non-transformed cells, and their formation is linked to cell cycle arrest and establishment of senescence (31,36). It remains to be determined whether the formation of PNAs is positively involved in rDNA repair, resulting in a return of at least some PNA-forming cells to the cell cycle, or if they play a role in blocking the repair of DNA double-stranded breaks on rDNA, broadly analogous to the shelterin complex on telomeres during replicative senescence (96). We propose that the pro-senescent role of PNAs may contribute to the maintenance of rDNA stability, thereby limiting the potential of hazardous genomic instability and, hence, the risk of cellular transformation. Analogous to checkpoint responses and oncogene-induced senescence (97,98) the PNA-associated senescence might provide one aspect of the multifaceted cell-autonomous anti-cancer barrier, in this case guarding the integrity of the most vulnerable repetitive rDNA loci, possibly at the expense of accumulated cellular senescence-associated decline of functional tissues during aging.”

      Our responses to recommendations from the Editors:

      (1) Since this paper does not provide a mechanistic insight into how the different PNA forms after DNA damage and PolI inhibition such as doxorubicin (DOXO) treatment and how HR modulates the PNA formation, it is very important to provide some experimental data for those. For example, as the #3 reviewer suggested, the time-lapse analysis of PML and a rDNA marker after DOXO treatment and recovery, with morphological analysis, would be beneficial.

      We fully agree that live-cell imaging is essential for a better understanding of the evolution and function of PNAs. The requested time-lapse analysis of the dynamics of the PNA morphological stages after DOXO treatment and recovery is available to the Reviewers and readers in our previously published article, which reported the PNA phenomenon and the basic live-cell imaging data after doxorubicin treatment using ectopically expressed PML-GFP and B23-RFP (Imrichova et al.; doi: 10.18632/aging.102248). In our present revised manuscript, we now refer to this work in the Introduction and further stress that those data were based on live-cell imaging, to better highlight this point along the lines recommended by the Reviewers. We have now also added an improved scheme that better explains the temporal dynamics of PNA transitions (Figure 1A).

      (2) In the same line as point #1, it is very important to show what kind of signaling pathway is necessary for PNA formation upon DSB formation with PolI inhibition. For example, as the #2 reviewer advised, the role of ATM or ATR could be tested by adding their inhibitor during the PNA formation. 

      Again, we fully agree that clarification of the signaling pathway required for PNA formation is crucial, and we are grateful for this stimulating recommendation. While the mentioned Reviewer no. 2 (in his/her Public comments) asked only about the role of ATM, the Editors rightly requested that we should use distinct inhibitors to test the respective roles of not only ATM but also ATR. As recommended, we have tested the importance of ATM and ATR kinase activities by inhibiting them during PNA formation. These newly generated data clearly showed that the activity of either kinase is essential for the efficient formation of PNAs, thereby providing a significant new mechanistic insight in the revised dataset. In the manuscript, these new results are now shown in Figures 5A and B. We also addressed this issue in the Public Review (Reviewer #2 point 3).

      (3) Given that the association of the PML body with telomeres in ALT cells (ALT-associated PML Body, APB) has been well established in the field, the authors need to mention this in the Introduction and also clearly compare in the Discussion how PNA is similar to and different from APB.

      We have followed this conceptually important recommendation exactly as suggested: i) We now mention the ALT-associated PML Body (APB) in the Introduction section (end of the second paragraph) and ii) In much more detail, we now compare the conceptual analogy in terms of similarities and differences between PNA and APB in the revised Discussion. We also address this issue in the document Response to Public Review (Reviewer #3 point 6). Indeed, we agree that this comparison is very fitting in the context of our dataset and informative for the broad audience.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Major points. 

      (1) Any treatments shown in Figure 1B and 1C did not induce PNA in most of the cells with around 20% for a maximum value. What time point(s) the authors checked should be stated in the main text or the legend clearly. The authors need to mention the kinetics of different PNA classes and/or dose-response effects at least for doxorubicin and BMH-21. Or a cell-cycle stage effect should be analyzed and/or discussed given that HR is mainly operating in S and G2 phases.

      Thank you for pointing this out. We have now clarified the dose effects and also both analyzed and discussed PNA formation vis-à-vis cell cycle stages, as recommended by this insightful reviewer.

      First, we have now added an experimental scheme to the Figures for better clarity regarding the time points examined, as suggested.

      Second, our results show that drug doses indeed affect the number and subtype of PNAs that form after such treatments. We show PNAs (types and number) after 0.5 – 5 – 50 µM camptothecin, topotecan, and etoposide (Supplementary Figure 1G and H) and after 0.375 – 0.56 – 0.75 µM doxorubicin (Figure 2A-D and Supplementary Figure 2E-G).  

      The very first detailed analysis of PNA evolution was presented in Imrichova et al. (doi: 10.18632/aging.102248.), where we described, using live-cell imaging, the relationship between the individual doxorubicin-induced PNA types, their transitions, and dynamics. We found that the highest number of nuclei with PNAs was present between 24 and 48 h after treatment initiation. Thus, we selected this time point for PNAs detection after treatments presented in Figure 1B.  

      We have now also added the distribution of nuclei based on the presence of specific PNA types into Supplementary Figure 1F.

      We included the analysis of the dependence of PNA formation on specific cell cycle phases (see Figures 6E–I). A very detailed explanation of the observed cell cycle effects is presented in the document Responses to Public Review, re. Reviewer nr. 2, point 2, so please kindly read our response there.

      (2) Although the induction of PNA by DSBs at rDNA repeats is clearly shown in the paper and modulated by DSB repair pathways, the biological significance of this sub-nuclear structure has not been addressed at all. Is the PNA required for efficient DSB repair per se or pathway choice? Moreover, the PNA kinetic is peculiar. Once formed, the PNA did not show any turnover even after the DNA-damaging agents were washed away (Figure 4H). This structure is succeeded into the next generation after cell division. Such dynamics of PNL should be carefully addressed. 

      The reviewer is correct that the fate of the PNAs and the potential biological significance of this phenomenon required a better explanation. The majority (≈97%) of cells undergo cellular senescence after I-PpoI induction; we therefore suppose that the PNA structures are not passed into the next cell cycle, as the bulk of the cells do not proliferate/cycle after such treatments. In this regard, it should be noted that PNAs (PML-NDS) are associated with replicative senescence of human mesenchymal stem cells (our earlier publication: Janderova-Rossmeislova 2007; doi: 10.1016/j.jsb.2007.02.008). To answer the comment of this reviewer, we have never observed cells with PNAs entering mitosis. Based on these findings, we suggest that damage to the repetitive rDNA loci, such as the DSBs in our experiments, could commonly result in unsuccessful repair attempts leading to cellular senescence due to rDNA damage signaling. This is consistent with our new experiments highlighting the key role of signaling mediated by the major DNA damage response kinases ATM and ATR, including their role in PNA formation. For more details, please see also our response to Point 2 raised by the editors, on page 1 of this document, as well as our Public review response to Referee nr. 2, his/her points 2 and 3.

      From a broader perspective, relevant to the biological function of PNAs in this unorthodox cellular stress response, we showed that doxorubicin-induced PML-NDSs separate/sequester persistent rDNA DSBs from the regions of active pre-rRNA transcription. Again, the purpose of this process is not entirely clear at present. However, such separation of unrepaired rDNA from the rest of the genome could have a protective function, thereby limiting the risk of aberrant homologous recombination among hundreds of the repetitive, recombination-prone rDNA copies spread across five chromosomes. It should be stressed that PNAs are rarely seen in cancer cells, and their absence might be linked to the rDNA instability commonly seen in transformed cells. 

      As published in our previous study (Imrichova et al.; doi: 10.18632/aging.102248), we followed the fate of individual PML-NDS (the last stage of PNA) after the recovery from doxorubicin treatment using live-cell imaging. We observed that the fate of this structure could be diverse. Some persisted in the nucleus for many hours, but a portion of them disappeared. Their disappearance may be a manifestation of successful rDNA repair. However, what remains unresolved is why these cells do not reenter the cell cycle and instead develop a senescent phenotype, possibly reflecting some paracrine effects of a cocktail of diverse cytokines and chemokines secreted by the neighboring cells, a phenomenon well established in the senescence field as SASP (senescence-associated secretory phenotype).

      Notably, during the recovery phase from I-PpoI insult, some of the PML-NDS, in fact, increase in size over time (please refer to the graph in Author response image 1). This enlargement suggests ongoing processes within these structures. Additionally, the sequential accumulation of DHX9 (a multifunctional DNA/RNA helicase) in PNAs during recovery from the I-PpoI insult (as shown in Figure 4G and Supplementary Figure 4H in the revised manuscript) supports the hypothesis that PNAs are associated with as-yet poorly understood process(es). 

      Author response image 1.

      A scatter plot shows the changes in PNA diameters during the recovery phase from a 24-hour-long expression of I-PpoI.

      Last but not least, again relevant for the potential biological role of PNAs, we now also discuss the partial analogy of these structures with the PML-association with telomeres in cells that maintain their telomeres by the ALT recombinational process, as suggested by Referee no. 3 in the public review process. As this consideration addresses also the biological significance of the diverse PML associations and particularly our thoughts about the PNA, we copy/paste this paragraph from the Discussion section of our revised manuscript here, for the convenience of the Reviewer:

      “There are several similarities between PNAs and APBs. The interaction partner of PML located on both the telomeres and rDNA must be sumoylated, as the PML-SIM domain is essential for the formation of both APBs and PNAs (37,93). The PML IV isoform most efficiently forms APBs and also PNAs (16,37). PML clusters damaged telomeres into APBs, and we observe that several NORs converge in one PNA structure; thus, the PML-dependent clustering of damaged NORs is plausible. On the other hand, there is one critical difference between the otherwise broadly analogous APBs and PNAs. The process of ALT operates in transformed cancer cells that do not express the telomerase, thus enabling telomere maintenance, cell proliferation, and immortalization (94,95). The PNAs, on the other hand, were primarily detected in non-transformed cells, and their formation is linked to cell cycle arrest and establishment of senescence (31,36). It remains to be determined whether the formation of PNAs is positively involved in rDNA repair, resulting in a return of at least some PNA-forming cells to the cell cycle, or if they play a role in blocking the repair of DNA double-stranded breaks on rDNA, broadly analogous to the shelterin complex on telomeres during replicative senescence (96). We propose that the pro-senescent role of PNAs may contribute to the maintenance of rDNA stability, thereby limiting the potential of hazardous genomic instability and, hence, the risk of cellular transformation. Analogous to checkpoint responses and oncogene-induced senescence (97,98) the PNA-associated senescence might provide one aspect of the multifaceted cell-autonomous anti-cancer barrier, in this case guarding the integrity of the most vulnerable repetitive rDNA loci, possibly at the expense of accumulated cellular senescence-associated decline of functional tissues during aging.”

      (3) The association of PNA with DSB repair is shown by the colocalization with 53BP1 (Figures 3-5) and the kinetics of DSB repair were assessed by 53BP1 kinetics (Figure 5B). The authors need to check the colocalization of other DSB repair factors in homologous recombination (RPA and RAD51) and nonhomologous end joining (KU) and the kinetics of these DSB repair foci. 

      We are grateful for this very relevant suggestion. In response to this recommendation, we have examined additional markers linked to homologous recombination. In Figures 6A–D and Supplementary Figures 6A and B, we now also show the localization of RAD51 and RPA32 (pS33), along the lines recommended by this Reviewer.

      (4) In Figure 5B, 53BP1 foci in the "nucleolus" should be shown with that in the nucleus. 

      In the revised manuscript, we show histograms with a count of 53BP1 foci per nucleus.

      (5) The authors often used the words, "difficult-to-repair" and "easy-to-repair" DNA lesions. However, without the nature of these DNA lesions, it is early to distinguish the lesions. So, the authors should avoid them in the title, abstract, results, and figure legends. In Discussion, it is free to use them with a logical explanation. 

      Thank you for the recommendation. We have now changed the term “difficult-to-repair” to “persistent rDNA damage”, as this term better describes at face value the scenario encountered in these experiments. In the new version of the manuscript, we have now emphasized that PNAs are formed as a late response to rDNA damage. We added the observation that PNAs colocalized with rDNA lesions accumulated in the nucleolar cap (periphery of nucleolus), which are probably incompatible with NHEJ-mediated repair that otherwise occurs within the nucleolus. These persistent lesions contained phospho-RPA, a marker of resected DNA. However, RAD51 was not detected in such late lesions, indicating that the canonical RAD51-dependent HDR pathway is also restricted. Finally, we included a section defining such persistent DNA damage in the revised Discussion.

      Minor points: 

      (1) Page 5, second paragraph, line 6: "expression of PML". 

      (2) Page 5, line 6 from the bottom and Figure 1B: Actinomycin D is not a "specific" RNA polymerase I inhibitor. 

      (3) Page 6, first paragraph, last line: "DNA DSB" should be "DSB". 

      (4) Page 6, second paragraph, lines 6-7: What is the evidence that RNA polymerase I is active (need to explain to the readers)?

      (5)  Figure 1D and main text: Please mention DOXO is the abbreviation of doxorubicin. 

      We are grateful for these points, which have now all been corrected in the revised version of the manuscript.

      (6) Page 6, third paragraph, line 4 and Figure 1D: What is "esi"TOP1, as opposed to "si"TOP1?

      In the revised manuscript, we explained what ‘esiRNA’ means; in fact, it is the pool of biologically prepared siRNAs targeting the mRNA of the protein being knocked down.

      (7) Figures 2A and 2B: The effect of B02 alone on PNA should be shown as a control.

      As recommended, the effect of B02 alone is now presented in Supplementary Figures 2A and B. 

      (8) Page 7, first paragraph, last three lines: It is hard to catch how the authors suggested the inhibition of RAD51 suppressed RNAPI activity. If so, please check the incorporation of 5FU.

      Thank you for pointing out this confusing formulation. We have now removed from the revised manuscript the part of that original sentence: “which are predominantly associated with RNAPI inhibition”. 

      We observed that PML ‘balloons’ wrapped the nucleolus with the concomitantly observed complete inhibition of RNAPI in the nucleolus (Imrichova et al.; doi: 10.18632/aging.102248.). Nevertheless, we removed the original phrase from the revised version of the manuscript, as we agree with the reviewer that the causative relationship is so far lacking.

      (9) Page 7, second paragraph: It is critical to clarify what time B02 was added after DOXO removal or during DOXO treatment, or both.  

      We agree. In response, we have now added the experimental scheme showing all these temporal details.

      (10) Figure 2H: The experiment lacks control with siTDP2 without etoposide treatment. 

      We did not include this control, unfortunately.

      (11) Page 8, third paragraph, line 3 from the bottom; "besides of rDNA probe, we also utilized probes" is better. 

      We changed this sentence in the revised manuscript, as recommended. 

      (12) Figure 3B: In these multi-color images, it is hard to see blue and gray in merged ones. It is better to show images with a single color. 

      We agree that grayscale is better to follow. However, this type of presentation would significantly increase the number of images, a circumstance we wished to avoid in this already rather image-heavy dataset. Instead, when it was possible, we elevated the intensity of fluorescence in colored images. The list of images with this adjustment is present in the public review. 

      We also inserted the example of the image in greyscale here as Author response image 2. 

      Author response image 2.

      Representative images of nucleoli show the localization of 53BP1 (red; a marker of DNA DSB), PML (green; a marker of PML-NB or PNAs), rDNA (blue), and DJ (white; a marker of the acrocentric chromosome) after doxorubicin treatment (2 days) or in the recovery phase (1 and 4 days). The merge of all channels is shown together with the individual channels in greyscale. Scale, 5 µm.

      (13) Figure 4E: Please add values at D0. 

      We did not analyze the 53BP1 foci before adding Shield1 and doxycycline to induce the expression of I-PpoI (D0). However, as a control, we analyzed the 53BP1 foci in the cells treated for 24 h with the corresponding amount of DMSO as a mock treatment scenario (black line; NT).

      Reviewer #2 (Recommendations For The Authors): 

      (1) The data provided in this manuscript did not explicitly compare the easy-to-repair vs difficult-to-repair DNA lesions in rDNA, or at least lack quantitative measures with statistical analysis. Therefore, the title may need to be revised accordingly.

      We agree, and the title has now been revised to better capture the persistent nature of the rDNA damage that evokes PNA formation. Please see the response to Reviewer #1, Major point 5, presented above in this document.

      Reviewer #3 (Recommendations For The Authors): 

      (1) Live imaging is paramount to understanding the dynamic nature of PNAs.  

      We agree that live-cell imaging is important. We have addressed this issue in detail in our response to this Reviewer's Public Review comments, as well as in the first point of this document in response to the Editors. In short, although the data presented in this manuscript are based on quantifications of fixed-cell images, all these analyses benefit from the detailed live-cell imaging data that we reported previously, in which we carefully examined the dynamic behavior of PNAs (Imrichova et al., doi: 10.18632/aging.102248). To better illustrate the dynamic behavior of PNAs for the convenience of this reviewer, we include some data from our original article on this topic (referred to above): please see Author response image 3.

      Author response image 3.

      This Figure shows data published in Imrichova et al. (doi: 10.18632/aging.102248). PML IV-EGFP was ectopically expressed in RPE-1hTERT cells. The localization of PML was followed using live-cell imaging. (A) The bowl (in this work named cap) originates from the accumulation of diffuse PML. (B) The transition between bowl (named cap), funnel (named fork), and balloon (named circle). (C + D) PML IV-EGFP (green) and B23-RFP (red) were ectopically expressed in RPE-1hTERT cells. The localization of both proteins was followed by live-cell imaging. (C) The formation of PML-NDS from the funnel is shown. (D) The entire PNA cycle is shown: the PML-bowl formed on the border of the nucleolus, then transformed into the PML-funnel, and finally into PML-NDS.

      (2) The authors should consider cell cycle and cell proliferation in their analyses. 

      We are grateful for this recommendation, which echoes your own comment no. 2 in the Public Reviews document. Briefly, as we explained in the response to the Public Review, proliferation of PNA-containing cells is severely limited, as the vast majority of such cells enter a long-term arrest and cellular senescence. Furthermore, inspired by this comment, we have newly performed a series of experiments to address the frequencies of PNA formation vis-à-vis the cell cycle phase of the individual cells with rDNA damage. In the revised manuscript, we now include the data from these analyses: see Figures 6E–I and Supplementary Figures 6C–E. Our response in the Public Review provides a detailed description of these results.

      (3) Merged fluorescent micrographs in red and green are potentially not discernible to individuals with colour-vision deficiencies. Consider re-colouring into schemes that are more accessible. 

      We agree that some readers may have different preferences about fluorescence micrographs. Here, we used the classical combination of green and red, commonly employed in the field.

      (4) Single-colour fluorescent micrographs are easier to visualize in grey-scale. Whenever a single colour is shown, it will help reader comprehension if the images are shown in this manner. 

      As recommended, we have changed Figures 4C, F, and G from a single-color presentation to a greyscale. 

      (5) There are many long paragraphs that are difficult to digest. I suggest where possible breaking this text into smaller portions (e.g. Page 10, pages 13-14, page 16-17). 

      Thank you for pointing this out. We have now broken the text into smaller portions (in several places), as recommended.

      (6) The B02 and NU7441 data would be bolstered by genetic confirmation (depleting RAD51, BRCA2 or PALB2 for HR, DNA-PK or LIG4 for NHEJ).

      As recommended, we downregulated Rad51 and LIG4 by RNA interference. New data are presented in Figures 5F–I, 6E, and F, Supplementary Figures 5D, E, F–H, and Supplementary Figures 6C–E. The Public Review provides a detailed description of these results and the ensuing conclusions.

      (7) Microscopy results are often qualitative (Fig S1I, S2L, S3A) and need to be bolstered with quantitative data. 

      We appreciate this recommendation and have implemented quantifications in several important microscopy results, as follows:

      S1I: The quantification of the number of cells with types of PNAs after esiTOP1 is present in Supplementary Figure 1L

      S2L: The quantification (% of nuclei with PNAs) is in Figure 2H

      S3A: In this immuno-FISH figure, we captured nuclei with and w/o PNAs. Using the SQUASSH analysis, we identified size-based colocalization between rDNA–PML and DJ–PML presented in Supplementary Figure 3C.

      (8) Stats or error bars are missing (Fig 1D, 2H, S1C-E, S1F, S2A S2D-G, S3E, S4E).

      We apologize for those omissions and we have amended this aspect of the study in the revised manuscript as much as possible:

      Figure 1D: For the "AMD and doxorubicin" and "CX-5461 and doxorubicin" treatments, three and two biological replicates, respectively, are shown separately in the same graph. For AMD and the knockdown of TOP1, the mean from three biological replicates is shown. All these results indicate an elevated number of PNAs when RNAPI is inhibited.

      Figure 2H: The error bars are present. For siTDP2, the percentage of cells was the same (4%) in all replicates, so the error bar is not visible.

      Supplementary Figure 1C-E: Unfortunately, only one replicate (for all treatments) was analyzed by western blotting.

      Supplementary Figure 1F (in revised manuscript SF1G): The error bars are present. By this graph, we mainly wanted to present the variation in PNAs types. 

      Supplementary Figure 2A (in revised manuscript SF2C): We include whiskers showing the 10th–90th percentiles and a t-test.

      Supplementary Figure 2D-G (in revised manuscript SF2F-I): The error bars are present in all graphs. The changes in SF2F and G are not significant.

      Supplementary Figure 3E: This scheme shows the overlaps between rDNA and PML and between rDNA and 53BP1. The column graph based on these data is shown in Figure 3F.

      Supplementary Figure 4E: The plot profiles representing the mean fluorescence of PML and B23 are shown for different time points. 

      (9) PNA characteristics remind this reviewer of the well-described ALT-associated PML nuclear bodies (APBs) found in immortalized cells lacking telomerase (i.e. Alternative lengthening of telomeres). I recommend the authors look to published data on APBs to help guide how to approach their research within a framework of the cell cycle.

      We fully agree with this insightful comment, and have addressed this point in the Discussion section of the revised manuscript, quoted the relevant studies also in the Introduction, and indeed explained the parallels and also differences of PNA versus APB (see also our response to point 3 highlighted also by the Editors, early in this rebuttal document).  We have also addressed this issue in the Public Review (Reviewer #3 point 6). We agree with the reviewer that this comparison will be of wide interest to readers, given the potential insights into the biological roles of APBs and PNAs.

      For convenience, we copy/paste the relevant new paragraph of the Discussion here:

      “There are several similarities between PNAs and APBs. The interaction partner of PML located on both the telomeres and rDNA must be sumoylated, as the PML-SIM domain is essential for the formation of both APBs and PNAs (37,93). The PML IV isoform most efficiently forms APBs and also PNAs (16,37). PML clusters damaged telomeres into APBs, and we observe that several NORs converge in one PNA structure; thus, the PML-dependent clustering of damaged NORs is plausible. On the other hand, there is one critical difference between the otherwise broadly analogous APBs and PNAs. The process of ALT operates in transformed cancer cells that do not express the telomerase, thus enabling telomere maintenance, cell proliferation, and immortalization (94,95). The PNAs, on the other hand, were primarily detected in non-transformed cells, and their formation is linked to cell cycle arrest and establishment of senescence (31,36). It remains to be determined whether the formation of PNAs is positively involved in rDNA repair, resulting in a return of at least some PNA-forming cells to the cell cycle, or if they play a role in blocking the repair of DNA double-stranded breaks on rDNA, broadly analogous to the shelterin complex on telomeres during replicative senescence (96). We propose that the pro-senescent role of PNAs may contribute to the maintenance of rDNA stability, thereby limiting the potential of hazardous genomic instability and, hence, the risk of cellular transformation. Analogous to checkpoint responses and oncogene-induced senescence (97,98) the PNA-associated senescence might provide one aspect of the multifaceted cell-autonomous anti-cancer barrier, in this case guarding the integrity of the most vulnerable repetitive rDNA loci, possibly at the expense of accumulated cellular senescence-associated decline of functional tissues during aging.” 

      (10) Do PNAs mature/progress through the four distinct structures: bowl, to funnel, to balloon, and finally to PML-NDS? If true, this serves as a phenotypic read-out of damage induction (bowl) and repair (PML-NDS). It would suggest that persistent unrepairable damage (0.56 or 0.75 µM doxorubicin) prevents repair, leading to the formation of all the PNA structures except PML-NDS, while lower-dose doxorubicin (0.375 µM) allows repair to occur, facilitating progression to the PML-NDS state, which is then inhibited with B02.

      Again, this is a very insightful comment. Indeed, as the Reviewer suggests and as we explained e.g., in our response to point 1 raised by this reviewer, PNA progresses through four distinct structures/maturation stages. Our results indicate that individual PNA subtypes are tied to specific processes. PNA bowl-type is linked to the recognition of rDNA damage on the nucleolar surface. The PNA of the funnel-type clusters several rDNA loci from the nucleolus into PML-NDS, which is the ultimate structure sequestering unrepaired rDNA away from the reactivated nucleolus.

      There is a negative correlation between doxorubicin dose and the occurrence of PML-NDS and, indeed, blocking HDR with B02 combined with a lower doxorubicin dose results in a higher occurrence of all PNAs, including PML-NDS, emerging in the recovery phase. These findings indicate that a more severe extent of rDNA damage, which is associated with inhibition of RNAPI activity, is linked to the PNA types associated with RNAPI inhibition (originally published in Imrichova et al., doi: 10.18632/aging.102248). In contrast, a milder degree of rDNA damage induces the formation of PML-NDS.

    1. eLife assessment

      This valuable study examines the activity and function of dorsomedial striatal neurons in estimating time. The authors used various causal and correlational techniques to investigate how these pathways collectively contribute to interval timing in mice and found that the direct and indirect striatal pathways perform opposing roles in processing elapsed time. The evidence is solid. The manuscript would interest neuroscientists examining how striatum contributes to behavior.

    2. Reviewer #1 (Public Review):

      Summary:

      In this work, the authors examine the activity and function of D1 and D2 MSNs in dorsomedial striatum (DMS) during an interval timing task. In this task, animals must first nose poke into a cued port on the left or right; if not rewarded after 6 seconds, they must switch to the other port. Thus, this task requires animals to estimate if at least 6 seconds have passed after the first nose poke. After verifying that animals estimate the passage of 6 seconds, the authors examine striatal activity during this interval. They report that D1-MSNs tend to decrease activity, while D2-MSNs increase activity, throughout this interval. They suggest that this activity follows a drift-diffusion model, in which activity increases (or decreases) to a threshold after which a decision is made. The authors next report that optogenetically inhibiting D1 or D2 MSNs, or pharmacologically blocking D1 and D2 receptors, increased the average wait time. This suggests that both D1 and D2 neurons contribute to the estimate of time, with a decrease in their activity corresponding to a decrease in the rate of 'drift' in their drift-diffusion model. Lastly, the authors examine MSN activity while pharmacologically inhibiting D1 or D2 receptors. The authors observe that most recorded MSNs decrease their activity over the interval, with the rate decreasing with D1/D2 receptor inhibition.

      Major strengths:

      The study employs a wide range of techniques - including animal behavioral training, electrophysiology, optogenetic manipulation, pharmacological manipulations, and computational modeling. The question posed by the authors - how striatal activity contributes to interval timing - is of importance to the field and has been the focus of many studies and labs. This paper contributes to that line of work by investigating whether D1 and D2 neurons have similar activity patterns during the timed interval, as might be expected based on prior work based on striatal manipulations. However, the authors find that D1 and D2 neurons have distinct activity patterns. They then provide a decision-making model that is consistent with all results. The data within the paper is presented very clearly, and the authors have done a nice job presenting the data in a transparent manner (e.g., showing individual cells and animals). Overall, the manuscript is relatively easy to read and clear, with sufficient detail given in most places regarding the experimental paradigm or analyses used.

      Major weaknesses:

      One weakness to me is the impact of identifying whether D1 and D2 had similar or different activity patterns. Does observing increasing/decreasing activity in D2 versus D1, or different activity patterns in D1 and D2, support one model of interval timing over another, or does it further support a more specific idea of how DMS contributes to interval timing?

      I found the results presented in Figures 2 and 3 to be a little confusing or misleading. In Figure 2, the authors appear to claim that D1 neurons decrease their activity over the time interval while D2 neurons increase activity. The authors use this result to suggest that D1/D2 activity patterns are different. In Figure 3, a different analysis is done, and this time D2 neurons do not significantly increase their activity with time, conflicting with Figure 2. While in both figures, there is a significant difference between the mean slopes across the population, the secondary effect of positive/negative slope for D2/D1 neurons changes. I find this especially confusing as the authors refer back to the positive/negative slope for D2/D1 neurons result throughout the rest of the text.

      It is a bit unclear to me how the authors chose the parameters for the model, and how well the model explains behavior is quantified. It seems that the authors didn't perform cross-validation across trials (i.e., they chose parameters that explained behavior across all trials combined, rather than choosing parameters from a subset of trials and determining whether those parameters are robust enough to explain behavior on held-out trials). I think this would increase the robustness of the result.
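      The held-out check described above can be sketched as follows. This is a minimal illustration, not the authors' four-parameter model: it assumes response times follow a Wald (inverse Gaussian) distribution, which is the first-passage-time distribution of a one-boundary drift-diffusion process, and all function names, the fold count, and the parameter values are hypothetical.

```python
import math
import random


def sample_wald(mu, lam, rng):
    """Draw one Wald (inverse Gaussian) variate by the transformation method."""
    y = rng.gauss(0.0, 1.0) ** 2
    x = mu + (mu * mu * y) / (2 * lam) - (mu / (2 * lam)) * math.sqrt(
        4 * mu * lam * y + (mu * y) ** 2
    )
    return x if rng.random() <= mu / (mu + x) else mu * mu / x


def fit_wald(times):
    """Closed-form maximum-likelihood estimates (mu, lam) for Wald data."""
    mu = sum(times) / len(times)
    lam = len(times) / sum(1.0 / t - 1.0 / mu for t in times)
    return mu, lam


def wald_loglik(times, mu, lam):
    """Summed log-likelihood of `times` under a Wald(mu, lam) density."""
    return sum(
        0.5 * math.log(lam / (2 * math.pi * t ** 3))
        - lam * (t - mu) ** 2 / (2 * mu ** 2 * t)
        for t in times
    )


def cross_validate(times, k=5, seed=0):
    """Fit on k-1 folds of trials; score the mean log-likelihood on the rest."""
    trials = list(times)
    random.Random(seed).shuffle(trials)
    folds = [trials[i::k] for i in range(k)]
    scores = []
    for i, held_out in enumerate(folds):
        train = [t for j, fold in enumerate(folds) if j != i for t in fold]
        mu, lam = fit_wald(train)
        scores.append(wald_loglik(held_out, mu, lam) / len(held_out))
    return scores


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical response times (seconds); not data from the study.
    times = [sample_wald(9.0, 25.0, rng) for _ in range(300)]
    print(cross_validate(times))
```

      If the held-out scores are close to the score on the training trials, the fitted parameters generalize across trials; a large gap would suggest overfitting.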

      In addition, it remains a bit unclear to me how the authors changed the specific parameters they did to model the optogenetic manipulation. It seems these parameters were chosen because they fit the manipulation data. This makes me wonder if this model is flexible enough that there is almost always a set of parameters that would explain any experimental result; in other words, I'm not sure this model has high explanatory power.

      Lastly, the results are based on a relatively small dataset (tens of cells).

      Impact:

      The task and data presented by the authors are very intriguing, and there are many groups interested in how striatal activity contributes to the neural perception of time. The authors perform a wide variety of experiments and analysis to examine how DMS activity influences time perception during an interval-timing task, allowing for insight into this process. However, the significance of the key finding -- that D1 and D2 activity is distinct across time -- remains somewhat ambiguous to me.

    3. Reviewer #2 (Public Review):

      (1) Regarding the results in Figure 2 and Figure 5: in the heatmaps in Fig. 2F and Fig. 2E, the overall activity patterns of D1 and D2 MSNs look very similar; both D1 and D2 MSNs contain neurons showing decreasing or increasing activity during interval timing. In addition, the optogenetic and pharmacologic inhibition of either D1 or D2 MSNs resulted in similar behavioral outcomes. To me, the D1 and D2 MSN activities were more complementary than opposing. If the authors want to emphasize the opposing side of D1 and D2 MSNs, then the manipulation experiments need to be re-designed. Since the average activity of D2 MSNs increased while that of D1 MSNs decreased during interval timing, instead of using inhibitory manipulations in both pathways, the authors should use inhibitory manipulation in D2-MSNs while using optogenetics or pharmacology to activate D1-MSNs. In this way, the authors can demonstrate the opposing roles of D1 and D2 MSNs and the functions of increased activity in D2-MSNs and decreased activity in D1-MSNs.

      (2) Regarding the results in Figure 3 C and D, Figure 6 H and Figure 7 D, what is the sample size? From the single data points in the figures, it seems that the authors were using the number of cells to do statistical tests and plot the figures. For example, in Figure 3 C, if the authors use n = 32 D2 MSNs and n = 41 D1 MSNs to do the statistical test, even a small difference could become statistically significant. The authors should use the number of mice to do the statistical tests.
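      One way to make the animal, rather than the cell, the unit of analysis is to average each mouse's cells first and then test on those per-mouse values. The sketch below uses an exact permutation test on the difference of group means; the function names and the slope values are hypothetical and are not taken from the manuscript.

```python
from itertools import combinations
from statistics import fmean


def per_mouse_means(values_by_mouse):
    """Collapse each mouse's cells to one mean, so n = number of mice."""
    return [fmean(cells) for cells in values_by_mouse.values()]


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference of group means."""
    pooled = list(group_a) + list(group_b)
    observed = abs(fmean(group_a) - fmean(group_b))
    n_a, n_b, total = len(group_a), len(group_b), sum(pooled)
    hits = count = 0
    # Enumerate every way of relabelling which mice belong to group A.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        hits += diff >= observed - 1e-12
        count += 1
    return hits / count


if __name__ == "__main__":
    # Hypothetical ramping slopes per cell, grouped by mouse.
    d2_cells = {"m1": [0.02, 0.04], "m2": [0.01, 0.05], "m3": [0.03], "m4": [0.02]}
    d1_cells = {"m5": [-0.02], "m6": [-0.01, -0.03], "m7": [-0.03], "m8": [-0.01]}
    p = permutation_p_value(per_mouse_means(d2_cells), per_mouse_means(d1_cells))
    print(f"p = {p:.4f} with n = 4 mice per group")
```

      With few animals the smallest attainable p-value is bounded by the number of possible relabellings, which makes the cost of a small cohort explicit.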

      (3) Regarding the results in Figure 5, what is the reason for the increase in the response times? The authors should plot the position track during intervals (0-6 s) with or without optogenetic or pharmacologic inhibition. The authors can check Figures 3, 5, and 6 in the paper https://doi.org/10.1016/j.cell.2016.06.032 for reference to analyze the data.

    4. Reviewer #3 (Public Review):

      Summary:

      The cognitive striatum, also known as the dorsomedial striatum, receives input from brain regions involved in high-level cognition and plays a crucial role in processing cognitive information. However, despite its importance, the extent to which different projection pathways of the striatum contribute to this information processing remains unclear. In this paper, Bruce et al. conducted a study using various causal and correlational techniques to investigate how these pathways collectively contribute to interval timing in mice. Their results were consistent with previous research, showing that the direct and indirect striatal pathways perform opposing roles in processing elapsed time. Based on their findings, the authors proposed a revised computational model in which two separate accumulators track evidence for elapsed time in opposing directions. These results have significant implications for understanding the neural mechanisms underlying cognitive impairment in neurological and psychiatric disorders, as disruptions in the balance between direct and indirect pathway activity are commonly observed in such conditions.

      Strengths:

      The authors employed a well-established approach to study interval timing and employed optogenetic tagging to observe the behavior of specific cell types in the striatum. Additionally, the authors utilized two complementary techniques to assess the impact of manipulating the activity of these pathways on behavior. Finally, the authors utilized their experimental findings to enhance the theoretical comprehension of interval timing using a computational model.

      Weaknesses:

      The behavioral task used in this study is best suited for investigating elapsed time perception, rather than interval timing. Timing bisection tasks are often employed to study interval timing in humans and animals. In the optogenetic experiment, the laser was kept on for too long (18 seconds) at high power (12 mW). This has been shown to cause adverse effects on population activity (for example, through heating the tissue) that are not necessarily related to their function during the task epochs. Given the systemic delivery of pharmacological interventions, it is difficult to conclude that the effects are specific to the dorsomedial striatum. Future studies should use the local infusion of drugs into the dorsomedial striatum.

      Comments on revised version:

      Thank you for the comprehensive revisions. Most of my (addressable) concerns were addressed. The current version of your manuscript appears significantly improved.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      Summary: 

      In this work, the authors examine the activity and function of D1 and D2 MSNs in dorsomedial striatum (DMS) during an interval timing task. In this task, animals must first nose poke into a cued port on the left or right; if not rewarded after 6 seconds, they must switch to the other port. Critically, this task thus requires animals to estimate if at least 6 seconds have passed after the first nose poke - this is the key aspect of the task focused on here. After verifying that animals reliably estimate the passage of 6 seconds by leaving on average after 9 seconds, the authors examine striatal activity during this interval. They report that D1-MSNs tend to decrease activity, while D2-MSNs increase activity, throughout this interval. They suggest that this activity follows a drift-diffusion model, in which activity increases (or decreases) to a threshold after which a decision (to leave) is made. The authors next report that optogenetically inhibiting D1 or D2 MSNs, or pharmacologically blocking D1 and D2 receptors, increased the average wait time of the animals to 10 seconds on average. This suggests that both D1 and D2 neurons contribute to the estimate of time, with a decrease in their activity corresponding to a decrease in the rate of 'drift' in their drift-diffusion model. Lastly, the authors examine MSN activity while pharmacologically inhibiting D1 or D2 receptors. The authors observe that most recorded MSNs decrease their activity over the interval, with the rate decreasing with D1/D2 receptor inhibition.
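      The link between a lower drift rate and a longer wait can be illustrated with a minimal one-boundary drift-diffusion simulation. This is a sketch under stated assumptions, not the authors' model: the threshold, noise level, time step, and the two drift values (chosen so that mean crossing times fall near the 9 and 10 seconds quoted above) are all hypothetical.

```python
import random
import statistics


def simulate_response_time(drift, threshold=1.0, noise=0.2, dt=0.01,
                           max_time=60.0, rng=None):
    """Return the first time a noisy accumulator reaches `threshold`.

    The accumulator starts at 0 and, on each step of length `dt`, gains
    drift * dt plus Gaussian noise with standard deviation noise * sqrt(dt).
    Trials that never cross are cut off at `max_time`.
    """
    if rng is None:
        rng = random
    x, t = 0.0, 0.0
    step_sd = noise * dt ** 0.5
    while x < threshold and t < max_time:
        x += drift * dt + rng.gauss(0.0, step_sd)
        t += dt
    return t


def mean_response_time(drift, n_trials=500, seed=0):
    """Average simulated response time; trial i always uses seed + i, so two
    drift values are compared on identical noise sequences."""
    times = [
        simulate_response_time(drift, rng=random.Random(seed + trial))
        for trial in range(n_trials)
    ]
    return statistics.fmean(times)


if __name__ == "__main__":
    control = mean_response_time(drift=1 / 9)    # hypothetical control drift
    inhibited = mean_response_time(drift=1 / 10)  # hypothetical reduced drift
    print(f"control: {control:.2f} s, reduced drift: {inhibited:.2f} s")
```

      Because both conditions share the same noise on every trial, any difference in mean response time here comes from the drift rate alone.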

      Major strengths: 

      The study employs a wide range of techniques - including animal behavioral training, electrophysiology, optogenetic manipulation, pharmacological manipulations, and computational modeling. The behavioral task used by the authors is quite interesting and a nice way to probe interval timing in rodents. The question posed by the authors - how striatal activity contributes to interval timing - is of importance to the field and has been the focus of many studies and labs; thus, this paper can meaningfully contribute to that conversation. The data within the paper is presented very clearly, and the authors have done a nice job presenting the data in a transparent manner (e.g., showing individual cells and animals). Overall, the manuscript is relatively easy to read and clear, with sufficient detail given in most places regarding the experimental paradigm or analyses used. 

      We are glad our main points came through to the reviewer.  

      Major weaknesses: 

      I perceive two major weaknesses. The first is the impact or contextualization of their results in terms of the results of the field more broadly. More specifically, it was not clear to me how the authors are interpreting the striatal activity in the context of what others have observed during interval timing tasks. In other words - what was the hypothesis going into this experiment? Does observing increasing/decreasing activity in D2 versus D1 support one model of interval timing over another, or does it further support a more specific idea of how DMS contributes to interval timing? Or was the main question that we didn't know if D2 or D1 neurons had differential activity during interval timing? 

      This is a helpful comment. Our hypothesis was that D1 and D2 MSNs had similar patterns of activity. Our rationale was prior behavioral work from our group describing that blocking striatal D1 and D2 dopamine receptors had similar behavioral effects on interval timing (De Corte et al., 2019; Stutt et al., 2023). We rewrote our introduction with this idea in mind (Line 89):

      “We and others have found that striatal MSNs encode time across multiple intervals by time-dependent ramping activity or monotonic changes in firing rate across a temporal interval (Emmons et al., 2017; Gouvea et al., 2015; Mello et al., 2015; Wang et al., 2018). However, the respective roles of D2-MSNs and D1-MSNs are unknown. Past work has shown that disrupting either D2-dopamine receptors (D2) or D1-dopamine receptors (D1) powerfully impairs interval timing by increasing estimates of elapsed time (Drew et al., 2007; Meck, 2006). Similar behavioral effects were found with systemic (Stutt et al., 2024) or local dorsomedial striatal D2 or D1 disruption (De Corte et al., 2019a). These data lead to the hypothesis that D2 MSNs and D1 MSNs have similar patterns of ramping activity across a temporal interval. 

      We tested this hypothesis with a combination of optogenetics, neuronal ensemble recording, computational modeling, and behavioral pharmacology. We use a well-described mouse-optimized interval timing task (Balci et al., 2008; Bruce et al., 2021; Larson et al., 2022; Stutt et al., 2024; Tosun et al., 2016; Weber et al., 2023). Strikingly, optogenetic tagging of D2-MSNs and D1-MSNs revealed distinct neuronal dynamics, with D2-MSNs tending to increase firing over an interval and D1-MSNs tending to decrease firing over the same interval, similar to opposing movement dynamics (Cruz et al., 2022; Kravitz et al., 2010; Tecuapetla et al., 2016). MSN dynamics helped construct and constrain a four-parameter drift-diffusion computational model of interval timing, which predicted that disrupting either D2-MSNs or D1-MSNs would increase interval timing response times. Accordingly, we found that optogenetic inhibition of either D2-MSNs or D1-MSNs increased interval timing response times. Furthermore, pharmacological blockade of either D2- or D1-receptors also increased response times and degraded trial-by-trial temporal decoding from MSN ensembles. Thus, D2-MSNs and D1-MSNs have opposing temporal dynamics yet disrupting either MSN type produced similar effects on behavior. These data demonstrate how striatal pathways play complementary roles in elementary cognitive operations and are highly relevant for understanding the pathophysiology of human diseases and therapies targeting the striatum.”

      In the second, I felt that some of the conclusions suggested by the authors don't seem entirely supported by the data they present, or the data presented suggests a slightly more complicated story. Below I provide additional detail on some of these instances. 

      Regarding the results presented in Figures 2 and 3: 

      I am not sure the PC analysis adds much to the interpretation, and potentially unnecessarily complicates things. In particular, running PCA on a matrix of noisy data that is smoothed with a Gaussian will often return PCs similar to what is observed by the authors, with the first PC being a line up/down, the 2nd PC being a parabola that is up/down, etc. Thus, I'm not sure that there is much to be interpreted by the specific shape of the PCs here. 

      We are glad the reviewer raised this point. First, regarding the components in noisy data, what the reviewer says is correct, but usually, the variance explained by PC1 is small. This is the reason we include scree plots in our PC analysis (Fig 3B and Fig 6G). When we compare our PC1s to variance explained in random data, our PC1 variance is always stronger. We have now included this in our manuscript:

      First, we generated random data and examined how much variance PC1 might generate. 

      We added this to the methods (Line 634):

      “The variance of PC1 was empirically compared against data generated from 1000 iterations of data from random timestamps with identical bins and kernel density estimates. Average plots were shown with Gaussian smoothing for plotting purposes only.”

      These data suggested that our PC1 was stronger than that observed in random data (Line 183):

      “PCA identified time-dependent ramping activity as PC1 (Fig 3A), a key temporal signal that explained 54% of variance among tagged MSNs (Fig 3B; variance for PC1 p = 0.009 vs 46 (44-49)% variance for PC1 derived from random data; Narayanan, 2016).”

      And in the pharmacology data (Line 367):

      “The first component (PC1), which explained 54% of neuronal variance, exhibited “time-dependent ramping”, or monotonic changes over the 6 second interval immediately after trial start (Fig 6F-G; variance for PC1 p = 0.001 vs 46 (45-47)% variance in random data; Narayanan, 2016).”
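      The comparison of PC1 variance against random data can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' pipeline: it assumes NumPy is available, stands in a simple Gaussian smoother for the kernel density estimates, scatters each surrogate neuron's spikes uniformly over the bins, and uses hypothetical names and ensemble sizes throughout. The key design point is that the surrogate data pass through exactly the same smoothing and PCA as the observed data.

```python
import numpy as np


def smooth(rates, sigma_bins=1.0):
    """Gaussian-smooth each neuron's binned activity (rows) along time."""
    half = int(3 * sigma_bins)
    kernel = np.exp(-0.5 * (np.arange(-half, half + 1) / sigma_bins) ** 2)
    kernel /= kernel.sum()
    return np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="same"), 1, rates
    )


def pc1_variance_explained(rates):
    """Fraction of variance captured by PC1 of a neurons x time-bins matrix."""
    z = rates - rates.mean(axis=1, keepdims=True)
    sd = z.std(axis=1, keepdims=True)
    z = z / np.where(sd > 0, sd, 1.0)        # z-score each neuron
    z = z - z.mean(axis=0, keepdims=True)    # centre each time bin for PCA
    s = np.linalg.svd(z, compute_uv=False)   # singular values, largest first
    return float(s[0] ** 2 / np.sum(s ** 2))


def null_pc1_variance(spike_counts, n_bins, n_iter=1000, seed=0):
    """PC1 variance for surrogate ensembles with spikes at random times."""
    rng = np.random.default_rng(seed)
    null = np.empty(n_iter)
    for i in range(n_iter):
        surrogate = np.stack([
            np.bincount(rng.integers(0, n_bins, size=n), minlength=n_bins)
            for n in spike_counts
        ]).astype(float)
        null[i] = pc1_variance_explained(smooth(surrogate))
    return null


def empirical_p_value(observed, null):
    """Share of null values at least as large as the observed one."""
    return float((np.sum(null >= observed) + 1) / (len(null) + 1))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    time = np.linspace(0.0, 1.0, 30)
    # Hypothetical ensemble: 20 neurons ramp up and 20 ramp down.
    expected = np.vstack([2 + 18 * time] * 20 + [20 - 18 * time] * 20)
    counts = rng.poisson(expected)
    observed = pc1_variance_explained(smooth(counts.astype(float)))
    null = null_pc1_variance(counts.sum(axis=1), n_bins=30, n_iter=200)
    print(observed, float(np.median(null)), empirical_p_value(observed, null))
```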

      Second, we note that we have used this analysis extensively in the past, and PC1 has always been identified as a linear ramping in our work and in work by others (Line 179):

      “Work by our group and others has uniformly identified PC1 as a linear component among corticostriatal neuronal ensembles during interval timing (Bruce et al., 2021; Emmons et al., 2020, 2019, 2017; Kim et al., 2017a; Narayanan et al., 2013; Narayanan and Laubach, 2009; Parker et al., 2014; Wang et al., 2018).”

      Third, we find that PC1 is highly correlated to the GLM slope (Line 205):

      “Trial-by-trial GLM slope was correlated with PC1 scores in Fig 3A-C (PC1 scores vs. GLM slope r = -0.60, p = 10⁻⁸).”

      Fourth, our goal was not to heavily interpret PC1 – but to compare D1 vs. D2 MSNs, or compare population responses to D2/D1 pharmacology. We have now made this clear in introducing PCA analyses in the results (Line 177):

      “To quantify differences in D2-MSNs vs D1-MSNs, we turned to principal component analysis (PCA), a data-driven tool to capture the diversity of neuronal activity (Kim et al., 2017a).”

      Finally, despite these arguments the reviewer’s point is well taken. Accordingly, we have removed all analyses of PC2 from the manuscript which may have been overly interpretative. 

      We have now removed language that interpreted the components, and we now find the discussion of PC1 much more data-driven. We have also removed much of the advanced PC analysis in Figure S9. Given our extensive past work using this exact analysis of PC1, we think PCA adds a considerable amount to our manuscript and is now better justified, as the reviewer suggested.

      I think an alternative analysis that might be both easier and more informative is to compute the slope of the activity of each neuron across the 6 seconds. This would allow the authors to quantify how many neurons increase or decrease their activity much like what is shown in Figure 2.  

      We agree – we now do exactly this analysis in Figure 3D. We now describe this in detail, using the reviewer’s language, in the methods (Line 648):

      “To measure time-related ramping over the first 6 seconds of the interval, we used trial-by-trial generalized linear models (GLMs) at the individual neuron level in which the response variable was firing rate and the predictor variable was time in the interval or nosepoke rate (Shimazaki and Shinomoto, 2007). For each neuron, its time-related “ramping” slope was derived from the GLM fit of firing rate vs time in the interval, for all trials per neuron. All GLMs were run at a trial-by-trial level to avoid effects of trial averaging (Latimer et al., 2015) as in our past work (Bruce et al., 2021; Emmons et al., 2017; Kim et al., 2017b).”
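
      As an illustration of the slope measure described above, the following is a minimal sketch, not the analysis code used for the manuscript. It assumes a Gaussian GLM with identity link and a single predictor (time), which reduces to an ordinary least-squares slope; it omits the nosepoke regressor and the between-mouse variance structure. The function name `ramping_slope` and the binned data are hypothetical.

```python
def ramping_slope(times, rates):
    """Least-squares slope of firing rate (spikes/s) against time in the interval (s).

    `times` and `rates` hold one entry per time bin per trial, pooled over all
    trials of a single neuron (trial-by-trial data, not trial averages).
    """
    n = len(times)
    if n != len(rates) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_t = sum(times) / n
    mean_r = sum(rates) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time values must not all be identical")
    sxy = sum((t - mean_t) * (r - mean_r) for t, r in zip(times, rates))
    return sxy / sxx


# Hypothetical neuron: two trials, 1-second bins over the first 6 seconds.
bin_centers = [0.5, 1.5, 2.5, 3.5, 4.5, 5.5]
times = bin_centers * 2
rates = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0,   # trial 1
         1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # trial 2
slope = ramping_slope(times, rates)       # spikes/s per second of interval
```

      A positive slope indicates firing that ramps up over the interval and a negative slope indicates ramping down, so the sign and size of the slope can be compared between cell types.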

      And to the results (Line 194):

      “To interrogate these dynamics at a trial-by-trial level, we calculated the linear slope of D2-MSN and D1-MSN activity over the first 6 seconds of each trial using generalized linear modeling (GLM) of effects of time in the interval vs trial-by-trial firing rate (Latimer et al., 2015).”

      Relatedly, it seems that the data shown in Figure 2D *doesn't* support the authors' main claim regarding D2/D1 MSNs increasing/decreasing their activity, as the trial-by-trial slope is near 0 for both cell types. 

      This likely refers to Figure 3D. The reviewer is correct that the changes in slope are small and near 0. Our goal was to show that D2-MSN and D1-MSN slopes were distinct – rather than increasing and decreasing. We have added this to the abstract (Line 46):

      “We found that D2-MSNs and D1-MSNs exhibited distinct dynamics over temporal intervals as quantified by principal component analyses and trial-by-trial generalized linear models.”

      We have clarified this idea in our hypothesis (Line 96):

      “These data led to the hypothesis that D2 MSNs and D1 MSNs have similar patterns of ramping activity across a temporal interval.”

      We have added this idea to the results (Line 194):

      “To interrogate these dynamics at a trial-by-trial level, we calculated the linear slope of D2-MSN and D1-MSN activity over the first 6 seconds of each trial using generalized linear modeling (GLM) of effects of time in the interval vs trial-by-trial firing rate (Latimer et al., 2015). Nosepokes were included as a regressor for movement. GLM analysis also demonstrated that D2-MSNs had significantly different slopes (-0.01 spikes/second (-0.10 – 0.10)), which were distinct from D1-MSNs (-0.20 (-0.47 – -0.06; Fig 3D; F = 8.9, p = 0.004 accounting for variance between mice (Fig S3B); Cohen’s d = 0.8; power = 0.98; no reliable effect of sex (F = 0.02, p = 0.88) or switching direction (F = 1.72, p = 0.19)). We found that D2-MSNs and D1-MSNs had significantly different slopes even when excluding outliers (4 outliers excluded outside of 95% confidence intervals; F = 7.51, p = 0.008 accounting for variance between mice) and when the interval was defined as the time between trial start and the switch response on a trial-by-trial basis for each neuron (F = 4.3, p = 0.04 accounting for variance between mice). Trial-by-trial GLM slope was correlated with PC1 scores in Fig 3A-C (PC1 scores vs. GLM slope r = -0.60, p = 10^-8). These data demonstrate that D2-MSNs and D1-MSNs had distinct slopes of firing rate across the interval and were consistent with analyses of average activity and PC1, which exhibited time-related ramping.”

      And Line 215:

      “In summary, we used optogenetic tagging to record from D2-MSNs and D1-MSNs during interval timing. Analyses of average activity, PC1, and trial-by-trial firing-rate slopes over the interval provide convergent evidence that D2-MSNs and D1-MSNs had distinct and opposing dynamics during interval timing. These data provide insight into temporal processing by striatal MSNs.”

      And in the discussion (Line 415):

      “We describe how striatal MSNs work together in complementary ways to encode an elementary cognitive process, interval timing. Strikingly, optogenetic tagging showed that D2-MSNs and D1-MSNs had distinct dynamics during interval timing.”

      We have now included a new plot with box plots to make the differences in Figure 3D clear.

      Other reviewers requested additional qualitative descriptions of our data, and we have referred to increases / decreases in this context. 

      Regarding the results in Figure 4: 

      The authors suggest that their data is consistent with a drift-diffusion model. However, it is unclear how well the output from the model fits the activity from neurons the authors recorded. Relatedly, it is unclear how the parameters were chosen for the D1/D2 versions of this model. I think that an alternate approach that would answer these questions is to fit the model to each cell, and then examine the best-fit parameters, as well as the ability of the model to predict activity on trials held out from the fitting process. This would provide a more rigorous method to identify the best parameters and would directly quantify how well the model captures the data. 

      We are glad the reviewer raised these points. Our goal was to use neuronal activity to fit behavioral activity, not the reverse. While we understand the reviewer’s point, we note that one behavioral output (switch time) can be encoded by many patterns of neuronal activity; thus, we are not sure we can use the model developed for behavior to fit diverse neuronal activity, or an ensemble of neurons. We have made this clear in the manuscript (Line 251):

      “Our model aimed to fit statistical properties of mouse behavioral responses while incorporating MSN network dynamics. However, the model does not attempt to fit individual neurons’ activity, because our model predicts a single behavioral parameter – switch time – that can be caused by the aggregation of diverse neuronal activity.”

      To do something close to what the reviewer suggested, we attempted to predict behavior directly from neuronal ensembles. We have now made this clear in the methods (Line 682):

      “Analysis and modeling of mouse MSN-ensemble recordings. Our preliminary analysis found that, for a sufficiently large number of neurons ($N > 11$), each recorded ensemble of MSNs on a trial-by-trial basis could predict when mice would respond. We took the following approach: First, for each MSN, we convolved its trial-by-trial spike train $Spk(t)$ with a 1-second exponential kernel $K(t) = w\,e^{-t/w}$ if $t > 0$ and $K(t) = 0$ if $t \le 0$ (Zhou et al., 2018; here $w = 1$ s). Therefore, the smoothed, convolved spiking activity of neuron $j$ ($j = 1, 2, \ldots, N$),

      $$x_j(t) = \int_0^t Spk_j(s)\,K(t - s)\,ds,$$

      tracks and accumulates the most recent (one second, on average) firing-rate history of the $j$-th MSN, up to moment $t$. We hypothesized that the ensemble activity $(x_1(t), x_2(t), \ldots, x_N(t))$, weighted with some weights $\beta_j$, could predict the trial switch time $t^*$ by considering the sum

      $$x(t) = \sum_{j=1}^{N} \beta_j\,x_j(t)$$

      and the sigmoid

      $$\left(1 + e^{-(x(t) - 0.5)/k}\right)^{-1}$$

      that approximates the firing rate of an output unit. Here parameter $k$ indicates how fast $x(t)$ crosses the threshold 0.5 coming from below (if $k > 0$) or coming from above (if $k < 0$) and relates the weights $\beta_j$ to the unknowns $\hat{\beta}_j = \beta_j/k$ and $\hat{\beta}_0 = -0.5/k$. Next, we ran a logistic fit for every trial for a given mouse over the spike count predictor matrix $(x_1(t), x_2(t), \ldots, x_N(t))$ from the mouse MSN recorded ensemble, and observed value $t^*$, estimating the coefficients $\hat{\beta}_0$ and $\hat{\beta}_j$, and so, implicitly, the weights $\beta_j$. From there, we compute the predicted switch time $t^*_{pred}$ by condition $x(t) = 0.5$. Accuracy was quantified by comparing the predicted switch time, within a 1-second window, to the observed switch time on a trial-by-trial basis (Fig S4).”
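
      The smoothing and threshold-crossing steps quoted above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the weights are supplied by hand rather than estimated by the logistic fit, only crossings from below are handled, and the function names, spike times, and weights are hypothetical.

```python
import math


def smooth_spikes(spike_times, t_grid, w=1.0):
    """Convolve a spike train with the causal kernel K(t) = w * exp(-t / w), t > 0.

    For a train of discrete spikes, the convolution at time t is the sum of
    K(t - s) over all spikes s that occurred strictly before t.
    """
    return [
        sum(w * math.exp(-(t - s) / w) for s in spike_times if s < t)
        for t in t_grid
    ]


def predicted_switch_time(ensemble_spikes, weights, t_grid, w=1.0, threshold=0.5):
    """Return the first grid time at which x(t) = sum_j beta_j * x_j(t) reaches threshold."""
    smoothed = [smooth_spikes(spikes, t_grid, w) for spikes in ensemble_spikes]
    for i, t in enumerate(t_grid):
        x = sum(beta * x_j[i] for beta, x_j in zip(weights, smoothed))
        if x >= threshold:
            return t
    return None  # the weighted activity never reached the threshold


# Hypothetical two-neuron ensemble on a 0.1-second grid from 0 to 6 seconds.
t_grid = [i / 10 for i in range(61)]
ensemble_spikes = [[1.0, 2.0, 3.0], [2.5, 3.5]]
weights = [0.25, 0.25]  # assumed values; in the quoted method these come from the fit
t_pred = predicted_switch_time(ensemble_spikes, weights, t_grid)
```

      In the quoted method, the predicted time would then be compared with the observed switch time on each trial and counted as accurate if it falls within the 1-second window.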

      And in the results (Line 254): 

      “We first analyzed trial-based aggregated activity of MSN recordings from each mouse ($x_j(t)$) where $j = 1, \ldots, N$ neurons. For D2-MSN or D1-MSN ensembles of $N > 11$, we found linear combinations of their neuronal activities, with some $\beta_j$ coefficients,

      $$x(t) = \sum_{j=1}^{N} \beta_j\,x_j(t),$$

      that could predict the trial-by-trial switch response times (accuracy > 90%, Fig S4; compared with < 20% accuracy for Poisson-generated spikes of same trial-average firing rate). The predicted switch time $t^*_{pred}$ was defined by the time when the weighted ensemble activity $x(t)$ first reached the value $x = 0.5$. Finally, we built DDMs to account for this opposing trend (increasing vs decreasing) of MSN dynamics and for ensemble threshold behavior defining $t^*_{pred}$; see the resulting model (Equations 1-3) and its simulations (Figure 4A-B).”

      And we have added a new figure, Figure S4, that demonstrates these trial-by-trial predictions of switch response times.  

      Note that we have included predictions from shuffled data, similar to what the reviewer suggested. Predictions are derived from neuronal ensembles on that trial; thus we could not apply a leave-one-out approach to trial-by-trial predictions.

      These models are highly predictive for larger ensembles and poorly predictive for smaller ensembles.  We think this model adds to the manuscript and we are glad the reviewer suggested it. 

      Relatedly, looking at the raw data in Figure 2, it seems that many neurons either fire at the beginning or end of the interval, with more neurons firing at the end, and more firing at the beginning, for D2/D1 neurons respectively. Thus, it's not clear to me whether the drift-diffusion model is a good model of activity. Or, perhaps the model is supposed to be related to the aggregate activity of all D1/D2 neurons? (If so, this should be made more explicit. The comment about fitting the model directly to the data also still stands).  

      Our model was inspired by the aggregate activity.  We have now made this clear in the results (Line 227): 

      “Our data demonstrate that D2-MSNs and D1-MSNs have opposite activity patterns. However, past computational models of interval timing have relied on drift-diffusion dynamics with a positive slope that accumulates evidence over time (Nguyen et al., 2020; Simen et al., 2011). To reconcile how these MSNs might complement to effect temporal control of action, we constructed a four-parameter drift-diffusion model (DDM). Our goal was to construct a DDM inspired by average differences in D2-MSNs and D1-MSNs that predicted switch-response time behavior.”

      Further, it's unclear to me how, or why, the authors changed the specific parameters they used to model the optogenetic manipulation. Were these parameters chosen because they fit the manipulation data? This I don't think is in itself an issue, but perhaps should be clearly stated, because otherwise it sounds a bit odd given the parameter changes are so specific. It is also not clear to me why the noise in the diffusion process would be expected to change with increased inhibition. 

      We have clarified that our parameters were chosen to best fit behavior (Line 266):

      “The model’s parameters were chosen to fit the distribution of switch-response times:

      F = 1, b = 0.52 (so T = 0.87), D = 0.135, σ = 0.052 for intact D2-MSNs (Fig 4A, in black); and F = 0, b = 0.48 (so T = 0.13), D = 0.141, σ = 0.052 for intact D1-MSNs (Fig 4B, in black).”

      Furthermore, we have clarified the approach to noise in the results (Line 247):  

      “The drift, together with noise ξ(t) (of zero mean and strength σ), leads to fluctuating accumulation which eventually crosses a threshold T (see Equation 3; Fig 4A-B).”

      And Line 279: 

      “The results were obtained by simultaneously decreasing the drift rate D (equivalent to lengthening the neurons’ integration time constant) and lowering the level of network noise σ: D = 0.129, σ = 0.043 for D2-MSNs in Fig 4A (in red; changes in noise had to accompany changes in drift rate to preserve switch response time variance); and D = 0.122, σ = 0.043 for D1-MSNs in Fig 4B (in blue). The model predicted that disrupting either D2-MSNs or D1-MSNs would increase switch response times (Fig 4C and Fig 4D) and would shift MSN dynamics.”
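
      To make the effect of these parameter changes concrete, the sketch below simulates a drift-diffusion process with the quoted values. Equations 1-3 are not reproduced in this response, so the update rule here, dx = D (F - x) dt + σ dW with x(0) = b and a response when x reaches T, is an assumed form chosen to match the description of D as an inverse integration time constant; it is not taken from the manuscript. The function names, time step, and seed are hypothetical.

```python
import math
import random


def noise_free_crossing_time(F, b, T, D):
    """Time for x' = D * (F - x), x(0) = b, to reach T when sigma = 0."""
    return math.log((F - b) / (F - T)) / D


def simulate_switch_time(F, b, T, D, sigma, dt=0.01, t_max=60.0, rng=None):
    """Euler-Maruyama simulation of the assumed process; returns the crossing time."""
    rng = rng or random.Random()
    x, t = b, 0.0
    rising = F > b  # D2-like units ramp up toward F = 1, D1-like units ramp down toward F = 0
    while t < t_max:
        x += D * (F - x) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        t += dt
        if (rising and x >= T) or (not rising and x <= T):
            return t
    return None  # no crossing within t_max


# Parameter values quoted in the response (intact vs disrupted).
d2_intact = noise_free_crossing_time(F=1, b=0.52, T=0.87, D=0.135)
d2_disrupted = noise_free_crossing_time(F=1, b=0.52, T=0.87, D=0.129)
d1_intact = noise_free_crossing_time(F=0, b=0.48, T=0.13, D=0.141)
d1_disrupted = noise_free_crossing_time(F=0, b=0.48, T=0.13, D=0.122)

# One noisy trial for intact D2-MSN parameters (seed chosen arbitrarily).
one_trial = simulate_switch_time(1, 0.52, 0.87, 0.135, 0.052, rng=random.Random(0))
```

      Under this assumed form, lowering D lengthens the noise-free crossing time for both parameter sets, which is consistent with the quoted prediction that disrupting either population increases switch response times.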

      Regarding the results in Figure 6: 

      My comments regarding the interpretation of PCs in Figure 2 apply here as well. In addition, I am not sure that examining PC2 adds much here, given that the authors didn't examine such nonlinear changes earlier in the paper. 

      We agree – we removed PC2 for these reasons. We have also noted that the primary reason for PC1 was to compare results of D2/D1 blockade (Line 362):

      “We noticed differences in MSN activity across the interval with D2 blockade and D1 blockade at the individual MSN level (Fig 6B-D) as well as at the population level (Fig 6E). We used PCA to quantify effects of D2 blockade or D1 blockade (Bruce et al., 2021; Emmons et al., 2017; Kim et al., 2017a). We constructed principal components (PC) from z-scored peri-event time histograms of firing rate from saline, D2 blockade, and D1 blockade sessions for all mice together. The first component (PC1), which explained 54% of neuronal variance, exhibited “time-dependent ramping”, or monotonic changes over the 6 second interval immediately after trial start (Fig 6F-G; variance for PC1 p = 0.001 vs 46 (45-47)% variance in random data; Narayanan, 2016).”

      As noted above, PC1 does not explain this level of variance in noisy data.

      We also reworked Figure 6 to make the effects of D2 and D1 blockade more apparent by moving the matched sorting to the main figure.

      A larger concern though that seems potentially at odds with the authors' interpretation is that there seems to be very little change in the firing pattern after D1 or D2 blockade. I see that in Figure 6F the authors suggest that many cells slope down (and thus, presumably, they are recording more D1 cells), and that this change in slope is decreased, but this effect is not apparent in Figure 6C, and Figure 6B shows an example of a cell that seems to fire in the opposite direction (increase activity). I think it would help to show some (more) individual examples that demonstrate the summary effect shown by the authors, and perhaps the authors can comment on the robustness (or the variability) of this result. 

      These are important suggestions. We changed our analysis to better capture the variability and main effects in the data. First, we now include 3 individual raster examples, exactly as the reviewer suggested.

      As the reviewer suggested, we wanted to compare variability for *all* MSNs. We sorted the same MSNs across saline, D2 blockade, and D1 blockade sessions. We detailed these sorting details in the methods (Line 618):

      “Single-unit recordings were made using a multi-electrode recording system (Open Ephys, Atlanta, GA). After the experiments, Plexon Offline Sorter (Plexon, Dallas, TX), was used to remove artifacts. Principal component analysis (PCA) and waveform shape were used for spike sorting. Single units were defined as those 1) having a consistent waveform shape, 2) being a separable cluster in PCA space, and 3) having a consistent refractory period of at least 2 milliseconds in interspike interval histograms. The same MSNs were sorted across saline, D2 blockade, and D1 blockade sessions by loading all sessions simultaneously in Offline Sorter and sorted using the preceding criteria. MSNs had to have consistent firing in all sessions to be included. Sorting integrity across sessions was quantified by comparing waveform similarity via correlation coefficients between sessions.”
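
      The waveform-similarity check described above can be illustrated with a Pearson correlation between a unit's mean waveforms from two sessions. This is a minimal sketch assuming that the correlation coefficient is Pearson's r computed on sample-aligned mean waveforms; the response does not specify these details, and the function name and waveform values are hypothetical.

```python
import math


def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical mean waveforms (arbitrary units) for one unit in two sessions,
# plus a differently shaped waveform from an unmatched unit.
saline = [0.0, -0.2, -1.0, -0.4, 0.3, 0.5, 0.2, 0.0]
d2_blockade = [0.0, -0.2, -0.9, -0.4, 0.3, 0.4, 0.2, 0.0]
unmatched = [0.0, 0.3, 0.6, 0.1, -0.8, -0.3, 0.0, 0.1]

r_matched = pearson_r(saline, d2_blockade)
r_unmatched = pearson_r(saline, unmatched)
```

      Comparing the distribution of matched correlations with correlations between unmatched units, as the response describes, indicates whether the same neurons were tracked across sessions.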

      To confirm that we were able to track neurons across sessions, we quantified waveform similarity (Line 353):

      “We analyzed 99 MSNs in sessions with saline, D2 blockade, and D1 blockade. We matched MSNs across sessions based on waveform and interspike intervals; waveforms were highly similar across sessions (correlation coefficient between matched MSN waveforms: saline vs D2 blockade r = 1.00 (0.99 – 1.00), rank sum vs correlations in unmatched waveforms p = 3x10^-44; saline vs D1 blockade r = 1.00 (1.00 – 1.00), rank sum vs correlations in unmatched waveforms p = 4x10^-50). There were no consistent changes in MSN average firing rate with D2 blockade or D1 blockade (F = 1.1, p = 0.30 accounting for variance between MSNs; saline: 5.2 (3.3 – 8.6) Hz; D2 blockade 5.1 (2.7 – 8.0) Hz; F = 2.2, p = 0.14; D1 blockade 4.9 (2.4 – 7.8) Hz).”

      As noted above, this enabled us to compare activity for the same MSNs across sessions in a new Figure 6 (previously, this analysis had been in Figure S9), and we used PCA to quantify this variability.

      By tracking neurons across saline, D2 blockade, and D1 blockade, readers can see all the variability in MSNs. We added these data to the results (Line 362):  

      “We noticed differences in MSN activity across the interval with D2 blockade and D1 blockade at the individual MSN level (Fig 6B-D) as well as at the population level (Fig 6E). We used PCA to quantify effects of D2 blockade or D1 blockade (Bruce et al., 2021; Emmons et al., 2017; Kim et al., 2017a). We constructed principal components (PC) from z-scored peri-event time histograms of firing rate from saline, D2 blockade, and D1 blockade sessions for all mice together. The first component (PC1), which explained 54% of neuronal variance, exhibited “time-dependent ramping”, or monotonic changes over the 6 second interval immediately after trial start (Fig 6F-G; variance for PC1 p = 0.001 vs 46 (45-47)% variance in random data; Narayanan, 2016). Interestingly, PC1 scores shifted with D2 blockade (Fig 6F; PC1 scores for D2 blockade: -0.6 (-3.8 – 4.7) vs saline: -2.3 (-4.2 – 3.2), F = 5.1, p = 0.03 accounting for variance between MSNs; no reliable effect of sex (F = 0.2, p = 0.63) or switching direction (F = 2.8, p = 0.10)). PC1 scores also shifted with D1 blockade (Fig 6F; PC1 scores for D1 blockade: -0.0 (-3.9 – 4.5), F = 5.8, p = 0.02 accounting for variance between MSNs; no reliable effect of sex (F = 0.0, p = 0.93) or switching direction (F = 0.9, p = 0.34)). There were no reliable differences in PC1 scores between D2 and D1 blockade. Furthermore, PC1 was distinct even when sessions were sorted independently and assumed to be fully statistically independent (Figure S10; D2 blockade vs saline: F = 5.8, p = 0.02; D1 blockade vs saline: F = 4.9, p = 0.03; all analyses accounting for variance between mice). Higher components explained less variance and were not reliably different between saline and D2 blockade or D1 blockade. Taken together, this data-driven analysis shows that D2 and D1 blockade produced similar shifts in MSN population dynamics represented by PC1. When combined with the major contributions of D1/D2 MSNs to PC1 (Fig 3C) these findings indicate that pharmacological D2 blockade and D1 blockade disrupt ramping-related activity in the striatum.”

      Finally, we included the data in which sessions were sorted independently and assumed to be fully statistically independent in a new Figure S10.

      And in the results (Line 376): 

      “Furthermore, PC1 was distinct even when sessions were sorted independently and assumed to be fully statistically independent (Figure S10; D2 blockade vs saline: F = 5.8, p = 0.02; D1 blockade vs saline: F = 4.9, p = 0.03; all analyses accounting for variance between mice). Higher components explained less variance and were not reliably different between saline and D2 blockade or D1 blockade.”

      These changes strengthen the manuscript and better show the main effects and variability of the data. 

      Regarding the results in Figure 7: 

      I am overall a bit confused about what the authors are trying to claim here. In Figure 7, they present data suggesting that D1 or D2 blockade disrupts their ability to decode time in the interval of interest (0-6 seconds). However, in the final paragraph of the results, the authors seem to say that by using another technique, they didn't see any significant change in decoding accuracy after D1 or D2 blockade. What do the authors make of this? 

      This was very unclear. The second classifier was predicting response time, but it was confusing, and we removed it. 

      Impact: 

      The task and data presented by the authors are very intriguing, and there are many groups interested in how striatal activity contributes to the neural perception of time. The authors perform a wide variety of experiments and analysis to examine how DMS activity influences time perception during an interval-timing task, allowing for insight into this process. However, the significance of the key finding - that D2/D1 activity increases/ decreases with time - remains somewhat ambiguous to me. This arises from a lack of clarity regarding the initial hypothesis and the implications of this finding for advancing our understanding of striatal functions. 

      As noted above, we clarified our hypothesis and implications, and strengthened several aspects of the data as suggested by this reviewer.  

      Reviewer #2 (Public Review): 

      Summary: 

      In the present study, the authors investigated the neural coding mechanisms for D1- and D2-expressing striatal direct and indirect pathway MSNs in interval timing by using multiple strategies. They concluded that D2-MSNs and D1-MSNs have opposing temporal dynamics, yet disrupting either type produced similar effects on behavior, indicating the complementary roles of D1- and D2-MSNs in cognitive processing. However, the data was incomplete to fully support this major finding. One major reason is the heterogeneous responses within the D1- or D2-MSN populations. In addition, there are additional concerns about the statistical methods used. For example, the majority of the statistical tests are based on the number of neurons, but not the number of mice. It appears that the statistical difference was due to the large sample size they used (n=32 D2-MSNs and n=41 D1-MSNs), but different neurons recorded in the same mouse cannot be treated as independent samples; they should use independent mouse-based statistical analysis. 

      Strengths: 

      The authors used multiple approaches including awake mice behavior training, optogenetic-assisted cell-type specific recording, optogenetic or pharmacological manipulation, neural computation, and modeling to study neuronal coding for interval timing. 

      We appreciate the reviewer’s careful read recognizing the breadth of our approach.  

      Weaknesses: 

      (1) More detailed behavior results should be shown, including the rate of the success switches, and how long it takes to wait in the second nose poke to get a reward. For line 512 and the Figure 1 legend, the reviewer is not clear about the reward delivery. The methods appear to state that the mouse had to wait for 18s, then make nose pokes at the second port to get the reward. What happens if the mouse made the second nose poke before 18 seconds, but then exited? Would the mouse still get the reward at 18 seconds? Similarly, what happens if the mice made the third or more nosepokes within 18 seconds? It is important to clarify because, according to the method described, if the mice made a second nose poke before 18 seconds, this already counted as the mouse making the "switch." Lastly, what if the mice exited before 6s in the first nosepoke? 

      We completely agree. We have now completely revised Figure 1 to include many of these task details.

      We have clarified remaining details in the methods (Line 548):

      “Interval timing switch task. We used a mouse-optimized operant interval timing task described in detail previously (Balci et al., 2008; Bruce et al., 2021; Tosun et al., 2016; Weber et al., 2023). Briefly, mice were trained in sound-attenuating operant chambers, with two front nosepokes flanking either side of a food hopper on the front wall, and a third nosepoke located at the center of the back wall. The chamber was positioned below an 8-kHz, 72-dB speaker (Fig 1A; MedAssociates, St. Albans, VT). Mice were 85% food restricted and motivated with 20 mg sucrose pellets (BioServ, Flemington, NJ). Mice were initially trained to receive rewards during fixed ratio nosepoke response trials. Nosepoke entry and exit were captured by infrared beams. After shaping, mice were trained in the “switch” interval timing task. Mice self-initiated trials at the back nosepoke, after which tone and nosepoke lights were illuminated simultaneously. Cues were identical on all trial types and lasted the entire duration of the trial (6 or 18 seconds). On 50% of trials, mice were rewarded for a nosepoke after 6 seconds at the designated first ‘front’ nosepoke; these trials were not analyzed. On the remaining 50% of trials, mice were rewarded for nosepoking first at the ‘first’ nosepoke location and then switching to the ‘second’ nosepoke location; the reward was delivered for initial nosepokes at the second nosepoke location after 18 seconds when preceded by a nosepoke at the first nosepoke location. Multiple nosepokes at each nosepoke were allowed. Early responses at the first or second nosepoke were not reinforced. Initial responses at the second nosepoke rather than the first nosepoke, alternating between nosepokes, going back to the first nosepoke after the second nosepoke were rare after initial training. Error trials included trials where animals responded only at the first or second nosepoke and were also not reinforced. We did not analyze error trials as they were often too few to analyze; these were analyzed at length in our prior work (Bruce et al., 2021).

      Switch response time was defined as the moment animals departed the first nosepoke before arriving at the second nosepoke. Critically, switch responses are a time-based decision guided by temporal control of action because mice switch nosepokes only if nosepokes at the first location did not receive a reward after 6 seconds. That is, mice estimate if more than 6 seconds have elapsed without receiving a reward to decide to switch responses. Mice learn this task quickly (3-4 weeks), and error trials in which an animal nosepokes in the wrong order or does not nosepoke are relatively rare and discarded. Consequently, we focused on these switch response times as the key metric for temporal control of action. Traversal time was defined as the duration between first nosepoke exit and second nosepoke entry and is distinct from switch response time when animals departed the first nosepoke. Nosepoke duration was defined as the time between first nosepoke entry and exit for the switch response times only. Trials were self-initiated, but there was an intertrial interval with a geometric mean of 30 seconds between trials.”
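
      The definitions of switch response time and traversal time given above can be expressed as a short sketch. It assumes that each trial is summarized by exit times from the first nosepoke and entry times at the second nosepoke, both measured from trial start; this event representation, the function name, and the example times are hypothetical and are not taken from the task software.

```python
def switch_and_traversal(first_exits, second_entries):
    """Return (switch_time, traversal_time) in seconds for one switch trial.

    switch_time: the last exit from the first nosepoke before the first arrival
    at the second nosepoke. traversal_time: first second-nosepoke entry minus
    switch_time. Returns None when no switch occurred on the trial.
    """
    if not first_exits or not second_entries:
        return None
    arrival = min(second_entries)
    exits_before = [t for t in first_exits if t <= arrival]
    if not exits_before:
        return None  # the animal went to the second nosepoke first (error trial)
    switch_time = max(exits_before)
    return switch_time, arrival - switch_time


# Hypothetical trial: three nosepokes at the first port, then a switch.
first_exits = [2.1, 5.0, 9.3]
second_entries = [10.4, 14.0, 18.2]
switch_time, traversal_time = switch_and_traversal(first_exits, second_entries)
```

      Keeping the two quantities separate reflects the distinction drawn in the methods: the switch response time marks the decision to leave, while the traversal time reflects movement between ports.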

      And in the results on Line 131: 

      “We investigated cognitive processing in the striatum using a well-described mouse-optimized interval timing task which requires mice to respond by switching between two nosepokes after a 6-second interval (Fig 1A; see Methods; (Balci et al., 2008; Bruce et al., 2021; Larson et al., 2022; Tosun et al., 2016; Weber et al., 2023)). In this task, mice initiate trials by responding at a back nosepoke, which triggers auditory and visual cues for the duration of the trial. On 50% of trials, mice were rewarded for nosepoking after 6 seconds at the designated ‘first’ front nosepoke; these trials were not analyzed. On the remaining 50% of trials, mice were rewarded for nosepoking at the ‘first’ nosepoke and then switching to the ‘second’ nosepoke; initial nosepokes at the second nosepoke after 18 seconds triggered reward when preceded by a first nosepoke. The first nosepokes occurred before switching responses and the second nosepokes occurred much later in the interval in anticipation of reward delivery at 18 seconds (Fig 1B-D). During the task, movement velocity peaked before 6 seconds as mice traveled to the front nosepoke (Fig 1E).

      We focused on the switch response time, defined as the moment mice exited the first nosepoke before entering the second nosepoke. Switch responses are a time-based decision guided by temporal control of action because mice switch nosepokes only if nosepoking at the first nosepoke does not lead to a reward after 6 seconds (Fig 1B-E). Switch responses are guided by internal estimates of time because no external cue indicates when to switch from the first to the second nosepoke (Balci et al., 2008; Bruce et al., 2021; Tosun et al., 2016; Weber et al., 2023). We defined the first 6 seconds after trial start as the ‘interval’, because during this epoch mice are estimating whether 6 seconds have elapsed and if they need to switch responses. In 30 mice, switch response times were 9.3 seconds (8.4 – 9.7; median (IQR); see Table 1 for a summary of mice, experiments, trials, and sessions). We studied dorsomedial striatal D2-MSNs and D1-MSNs using a combination of optogenetics and neuronal ensemble recordings in 9 transgenic mice (4 D2-Cre mice switch response time 9.7 (7.0 – 10.3) seconds; 5 D1-Cre mice switch response time 8.2 (7.7 – 8.7) seconds; rank sum p = 0.73; Table 1).”

      (2) There are a lot of time parameters in this behavior task. They are described in several places (the figure legend, the supplementary figure legend, and the methods) but are not defined clearly in the main text. This is inconvenient and sometimes confusing for readers. The authors should make a schematic diagram to illustrate the major parameters and describe them clearly in the main text. 

      We agree. We have clarified this in a new schematic, shading the interval in gray.

      And in the results on line 131:

      “We focused on the switch response time, defined as the moment mice exited the first nosepoke before entering the second nosepoke. Switch responses are a time-based decision guided by temporal control of action because mice switch nosepokes only if nosepoking at the first nosepoke does not lead to a reward after 6 seconds (Fig 1B-E). Switch responses are guided by internal estimates of time because no external cue indicates when to switch from the first to the second nosepoke (Balci et al., 2008; Bruce et al., 2021; Tosun et al., 2016; Weber et al., 2023). We defined the first 6 seconds after trial start as the ‘interval’, because during this epoch mice are estimating whether 6 seconds have elapsed and if they need to switch responses. In 30 mice, switch response times were 9.3 seconds (8.4 – 9.7; median (IQR); see Table 1 for a summary of mice, experiments, trials, and sessions). We studied dorsomedial striatal D2-MSNs and D1-MSNs using a combination of optogenetics and neuronal ensemble recordings in 9 transgenic mice (4 D2-Cre mice switch response time 9.7 (7.0 – 10.3) seconds; 5 D1-Cre mice switch response time 8.2 (7.7 – 8.7) seconds; rank sum p = 0.73; Table 1).”

      (3) In Line 508, the reviewer suggests the authors pay attention to those trials without "switch". It would be valuable to compare the MSN activity between those trials with or without a "switch". 

      This is a great suggestion. We analyzed such error trials and MSN activity in Figure 6 of Bruce et al., 2021. However, this manuscript was not designed to analyze errors, as they are rare beyond initial training (Bruce et al., 2021 focused on early training), and too inconsistent to permit robust analysis. This was added to the methods on Line 567:

      “Early responses at the first or second nosepoke were not reinforced. Initial responses at the second nosepoke rather than the first nosepoke, alternating between nosepokes, going back to the first nosepoke after the second nosepoke were rare after initial training. Error trials included trials where animals responded only at the first or second nosepoke and were also not reinforced. We did not analyze error trials as they were often too few to analyze; these were analyzed at length in our prior work (Bruce et al., 2021).”

      (4) The definition of interval is not very clear. It appears that the authors used a 6-second interval in analyzing the data in Figure 2 and Figure 3. But from my understanding, the interval should be the time from time "0" to the "switch", when the mice start to exit from the first nose poke. 

      We have now defined it explicitly in the schematic: 

      Incidentally, this reviewer asked us to analyze a longer epoch – this analysis beautifully justifies our focus on the first 6 seconds (now in Figure S2).

      We focus on the first six seconds as there are few nosepokes and switch responses during this epoch; however, we consider the reviewer’s definition and analyze the epoch the reviewer suggests from 0 to the switch in analyses below. 

      (5) For Figure 2 C-F, the authors only recorded 32 D2-MSNs in 4 mice, and 41 D1-MSNs in 5 mice. The sample size is too small compared to the sample size usually used in the field. In addition to the small sample size, the single-cell activity exhibited heterogeneity, which created potential issues. 

      We are glad the reviewer raised these points. First, our tagging dataset is relatively standard for optogenetic tagging. Second, we now include Cohen’s d for both PC and slope results for all optogenetic tagging analysis, which demonstrate that we have adequate statistical power and medium-to-large effect sizes (Line 186): 

      “In line with population averages from Fig 2G&H, D2-MSNs and D1-MSNs had opposite patterns of activity with negative PC1 scores for D2-MSNs and positive PC1 scores for D1-MSNs (Fig 3C; PC1 for D2-MSNs: -3.4 (-4.6 – 2.5); PC1 for D1-MSNs: 2.8 (-2.8 – 4.9); F = 8.8, p = 0.004 accounting for variance between mice (Fig S3A); Cohen’s d = 0.7; power = 0.80; no reliable effect of sex (F = 0.44, p = 0.51) or switching direction (F = 1.73, p = 0.19)).”

      And Line 197:

      “GLM analysis also demonstrated that D2-MSNs had significantly different slopes (0.01 spikes/second (-0.10 – 0.10)), which were distinct from D1-MSNs (-0.20 (-0.47 – 0.06); Fig 3D; F = 8.9, p = 0.004 accounting for variance between mice (Fig S3B); Cohen’s d = 0.8; power = 0.98; no reliable effect of sex (F = 0.02, p = 0.88) or switching direction (F = 1.72, p = 0.19)).”

      We added boxplots to Figure 3, which better highlight differences in these distributions.

      However, the reviewer’s point is well-taken, and we have added a caveat to the discussion exactly as the reviewer suggested (Line 496):

      “Second, although we had adequate statistical power and medium-to-large effect sizes, optogenetic tagging is low-yield, and it is possible that recording more of these neurons would afford greater opportunity to identify more robust results and alternative coding schemes, such as neuronal synchrony.”

      For both D1 and D2 MSNs, the authors tried to make conclusions on the "trend" of increasing in D2-MSNs and decreasing in D1-MSNs populations, respectively, during the interval. However, such a conclusion is not sufficiently supported by the data presented. It looks like the single-cell activity patterns can be separated into groups: one is a decreasing activity group, one is an increasing activity group and a small group for on and off response. Because of the small sample size, the author should pay attention to the variance across different mice (which needs to be clearly presented in the manuscript), instead of pooling data together and analyzing the mean activity. 

      We were not clear – we now do exactly as the reviewer suggested. We are not pooling any data – instead – as we state on line 620 - we are using linear-mixed effects models to account for mouse-specific and neuron-specific variance. This approach was developed with our statistics core for exactly the reasons the reviewer suggested (see letter). We state this explicitly in the methods (Line 704):

      “Statistics. All data and statistical approaches were reviewed by the Biostatistics, Epidemiology, and Research Design Core (BERD) at the Institute for Clinical and Translational Sciences (ICTS) at the University of Iowa. All code and data are made available at http://narayanan.lab.uiowa.edu/article/datasets. We used the median to measure central tendency and the interquartile range to measure spread. We used Wilcoxon nonparametric tests to compare behavior between experimental conditions and Cohen’s d to calculate effect size. Analyses of putative single-unit activity and basic physiological properties were carried out using custom routines for MATLAB.

      For all neuronal analyses, variability between animals was accounted for using generalized linear-mixed effects models and incorporating a random effect for each mouse into the model, which allows us to account for inherent between-mouse variability. We used fitglme in MATLAB and verified main effects using lmer in R. We accounted for variability between MSNs in pharmacological datasets in which we could match MSNs between saline, D2 blockade, and D1 blockade. P values < 0.05 were interpreted as significant.”
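
      As an illustration of the descriptive statistics named in this passage (median, interquartile range, and Cohen’s d), here is a minimal Python sketch. It is not the authors’ MATLAB or R code: the function names and sample values are hypothetical, Cohen’s d is computed with the pooled standard deviation (one common convention, assumed here), and the mixed-effects modeling itself is not reproduced.

```python
import statistics


def median_iqr(values):
    """Return (median, lower quartile, upper quartile) of a sample."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q2, q1, q3


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled_sd = (((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd


# Hypothetical switch response times in seconds, for illustration only.
group_a = [9.1, 9.7, 10.3, 7.0, 9.9]
group_b = [8.2, 7.7, 8.7, 8.0, 8.5]

med, q1, q3 = median_iqr(group_a)
print(f"median (IQR): {med:.1f} ({q1:.1f} - {q3:.1f})")
print(f"Cohen's d: {cohens_d(group_a, group_b):.2f}")
```

      Quartile conventions differ between software packages, so values from this sketch may not match MATLAB or R output exactly.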

      We have now stated in the results that we are explicitly accounting for variance between mice (Line 186): 

      “In line with population averages from Fig 2G&H, D2-MSNs and D1-MSNs had opposite patterns of activity with negative PC1 scores for D2-MSNs and positive PC1 scores for D1-MSNs (Fig 3C; PC1 for D2-MSNs: -3.4 (-4.6 – 2.5); PC1 for D1-MSNs: 2.8 (-2.8 – 4.9); F = 8.8, p = 0.004 accounting for variance between mice (Fig S3A); Cohen’s d = 0.7; power = 0.80; no reliable effect of sex (F = 0.44, p = 0.51) or switching direction (F = 1.73, p = 0.19)).”

      And on Line 197:

      “GLM analysis also demonstrated that D2-MSNs had significantly different slopes (0.01 spikes/second (-0.10 – 0.10)), which were distinct from D1-MSNs (-0.20 (-0.47 – 0.06); Fig 3D; F = 8.9, p = 0.004 accounting for variance between mice (Fig S3B); Cohen’s d = 0.8; power = 0.98; no reliable effect of sex (F = 0.02, p = 0.88) or switching direction (F = 1.72, p = 0.19)).”

      All statistics in the manuscript now explicitly account for variance between mice. 

      This is the approach that was recommended by the Biostatistics, Epidemiology, and Research Design Core (BERD) at the Institute for Clinical and Translational Sciences (ICTS) at the University of Iowa, which reviews all of our work.

      We note that these Cohen’s d values are usually interpreted as medium or large. 

      We performed statistical power calculations and include these to aid readers’ interpretation. These are all at least 0.8. 

      Finally, the reviewer uses the word ‘trend’. We define p values <0.05 as significant in the methods, and do not interpret trends (on line 717): 

      “P values < 0.05 were interpreted as significant.”

      And, we have now plotted values for each mouse in a new Figure S3.

      As noted in the figure legend, mouse-specific effects were analyzed using linear models that account for between-mouse variability, as discussed with our statisticians. However, the reviewer’s point is well taken, and we have added this idea to the discussion as suggested (Line 496):

      “Second, although we had adequate statistical power and medium-to-large effect sizes, optogenetic tagging is low-yield, and it is possible that recording more of these neurons would afford greater opportunity to identify more robust results and alternative coding schemes, such as neuronal synchrony.”

      (6) For Figure 2, from the activity in E and F, it seems that the activity already rose before the trial started, the authors should add some longer baseline data before time zero for clarification and comparison and show the timing of the actual start of the activity with the corresponding behavior. What behavior states are the mice in when initiating the activity? 

      This is a key point. First, we are not certain what state the animal is in until they initiate trials at the back nosepoke (“Start”). Therefore, we cannot analyze this epoch.  

      However, we can show neuronal activity during a longer epoch exactly as the reviewer suggested. Although there are modulations, the biggest difference between D2 and D1 MSNs is during the 0-6 second interval. This analysis supports our focus on the 0-6 second interval. We have included this as a new Figure S2.

      (7) The authors were focused on the "switch " behavior in the task, but they used an arbitrary 6s time window to analyze the activity, and tried to correlate the decreasing or increasing activities of MSNs to the neural coding for time. A better way to analyze is to sort the activity according to the "switch" time, from short to long intervals. This way, the authors could see and analyze whether the activity of D1 or D2 MSNs really codes for the different length of interval, instead of finding a correlation between average activity trends and the arbitrary 6s time window. 

      This is a great suggestion. We did exactly this and adjusted our linear models on a trial-by-trial basis to account for time between the start of the interval and the switch. This is now added to the methods (line 656): 

      “We performed additional sensitivity analysis excluding outliers and measuring firing rate from the start of the interval to the time of the switch response on a trial-by-trial level for each neuron.”

      And to the results (Line 201):

      “We found that D2-MSNs and D1-MSNs had a significantly different slope even when excluding outliers (4 outliers excluded outside of 95% confidence intervals; F=7.51, p=0.008 accounting for variance between mice) and when the interval was defined as the time between trial start and the switch response on a trial-by-trial basis for each neuron (F=4.3, p=0.04 accounting for variance between mice).”
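
      To make the two interval definitions concrete, the following Python sketch fits a firing-rate slope for a single trial either over the fixed 6-second interval or from trial start to that trial’s switch response. It is an assumption-laden stand-in, not the authors’ analysis: it uses ordinary least squares on binned firing rate rather than the GLM and mixed-effects models described in the response, and the function name, bin width, and spike times are hypothetical.

```python
def firing_rate_slope(spike_times, interval_end, bin_width=1.0):
    """Least-squares slope of binned firing rate versus time.

    spike_times: spike times in seconds relative to trial start (t = 0).
    interval_end: end of the analysis window (6.0 for the fixed interval,
        or the trial's switch response time for the trial-by-trial variant).
    Returns the slope in (spikes/second) per second.
    """
    n_bins = int(interval_end // bin_width)
    if n_bins < 2:
        raise ValueError("need at least two full bins to fit a slope")

    # Count spikes in each full bin between trial start and interval_end.
    counts = [0] * n_bins
    for t in spike_times:
        idx = int(t // bin_width)
        if t >= 0 and idx < n_bins:
            counts[idx] += 1

    rates = [c / bin_width for c in counts]
    centers = [(i + 0.5) * bin_width for i in range(n_bins)]

    # Closed-form ordinary least squares slope.
    mean_x = sum(centers) / n_bins
    mean_y = sum(rates) / n_bins
    sxx = sum((x - mean_x) ** 2 for x in centers)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(centers, rates))
    return sxy / sxx


# Hypothetical trial: spike times (s) and a switch response at 9.3 s.
spikes = [0.4, 1.1, 1.9, 2.2, 2.8, 3.5, 3.6, 4.4, 4.9, 5.1, 5.7, 7.2, 8.8]
fixed_slope = firing_rate_slope(spikes, interval_end=6.0)
switch_slope = firing_rate_slope(spikes, interval_end=9.3)
print(fixed_slope, switch_slope)
```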

      We now state our justification for focusing on the first 6 seconds of the interval (Line 134):

      “Switch responses are guided by internal estimates of time and temporal control of action because no external cue indicates when to switch from the first to the second nosepoke (Balci et al., 2008; Bruce et al., 2021; Tosun et al., 2016; Weber et al., 2023). We defined the first 6 seconds after trial start as the ‘interval’, because during this epoch mice are estimating whether 6 seconds have elapsed and if they need to switch responses.”

      As noted previously, epoch is now justified by Figure S2E.

      And we note that this focus minimizes motor confounds (Line 511):

      “Four lines of evidence argue that our findings cannot be directly explained by motor confounds: 1) D2-MSNs and D1-MSNs diverge between 0-6 seconds after trial start well before the first nosepoke (Fig S2), 2) our GLM accounted for nosepokes and nosepoke-related βs were similar between D2-MSNs and D1-MSNs, 3) optogenetic disruption of dorsomedial D2-MSNs and D1-MSNs did not change task-specific movements despite reliable changes in switch response time, and 4) ramping dynamics were quite distinct from movement dynamics. Furthermore, disrupting D2-MSNs and D1-MSNs did not change the number of rewards animals received, implying that these disruptions did not grossly affect motivation. Still, future work combining motion tracking with neuronal ensemble recording and optogenetics and including bisection tasks may further unravel timing vs. movement in MSN dynamics (Robbe, 2023).”

      We are glad the reviewer suggested this analysis as it strengthens our manuscript.  

      Reviewer #3 (Public Review): 

      Summary: 

      The cognitive striatum, also known as the dorsomedial striatum, receives input from brain regions involved in high-level cognition and plays a crucial role in processing cognitive information. However, despite its importance, the extent to which different projection pathways of the striatum contribute to this information processing remains unclear. In this paper, Bruce et al. conducted a study using a range of causal and correlational techniques to investigate how these pathways collectively contribute to interval timing in mice. Their results were consistent with previous research, showing that the direct and indirect striatal pathways perform opposing roles in processing elapsed time. Based on their findings, the authors proposed a revised computational model in which two separate accumulators track evidence for elapsed time in opposing directions. These results have significant implications for understanding the neural mechanisms underlying cognitive impairment in neurological and psychiatric disorders, as disruptions in the balance between direct and indirect pathway activity are commonly observed in such conditions. 

      Strengths: 

      The authors employed a well-established approach to study interval timing and employed optogenetic tagging to observe the behavior of specific cell types in the striatum. Additionally, the authors utilized two complementary techniques to assess the impact of manipulating the activity of these pathways on behavior. Finally, the authors utilized their experimental findings to enhance the theoretical comprehension of interval timing using a computational model. 

      We are grateful for the reviewer’s consideration of our work and for recognizing the strengths of our approach.  

      Weaknesses: 

      The behavioral task used in this study is best suited for investigating elapsed time perception, rather than interval timing. Timing bisection tasks are often employed to study interval timing in humans and animals.

      This is a key point, and the reviewer is correct. We use our task because of its translational validity; as far as we know, temporal bisection tasks have been used less often in human disease and in rodent models. We have included a new paragraph describing this in the discussion (Line 472):

      “Because interval timing is reliably disrupted in human diseases of the striatum such as Huntington’s disease, Parkinson’s disease, and schizophrenia (Hinton et al., 2007; Singh et al., 2021; Ward et al., 2011), these results have relevance to human disease. Our task version has been used extensively to study interval timing in mice and humans (Balci et al., 2008; Bruce et al., 2021; Stutt et al., 2024; Tosun et al., 2016; Weber et al., 2023). However, temporal bisection tasks, in which animals hold during a temporal cue and respond at different locations depending on cue length, have advantages in studying how animals time an interval because animals are not moving while estimating cue duration (Paton and Buonomano, 2018; Robbe, 2023; Soares et al., 2016). Our interval timing task version – in which mice switch between two response nosepokes to indicate their interval estimate has elapsed – has been used extensively in rodent models of neurodegenerative disease (Larson et al., 2022; Weber et al., 2024, 2023; Zhang et al., 2021), as well as in humans (Stutt et al., 2024). Furthermore, because many therapeutics targeting dopamine receptors are used clinically, these findings help describe how dopaminergic drugs might affect cognitive function and dysfunction. Future studies of D2-MSNs and D1-MSNs in temporal bisection and other timing tasks may further clarify the relative roles of D2- and D1-MSNs in interval timing and time estimation.”

      Furthermore, we have modified the use of the definition of interval timing in the abstract, introduction, and results to reflect the reviewer’s comment. For instance, in the abstract (Line 43):

      “We studied dorsomedial striatal cognitive processing during interval timing, an elementary cognitive task that requires mice to estimate intervals of several seconds and involves working memory for temporal rules as well as attention to the passage of time.”

      However, we think it is important to use the term ‘interval timing’ as it links to past work by our group and others.   

      The main results from unit recording (opposing slopes of D1/D2 cell firing rate, as shown in Figure 3D) appear to be very sensitive to a couple of outlier cells, and the predictive power of ensemble recording seems to be only slightly above chance levels. 

      This is a key point raised by other reviewers as well. We have now included measures of statistical power (as we interpret the reviewer’s comment of predictive power), effect size, and perform additional sensitivity analyses (Line 187): 

      “PC1 scores for D1-MSNs (Fig 3C; PC1 for D2-MSNs: -3.4 (-4.6 – 2.5); PC1 for D1-MSNs: 2.8 (-4.9 – -2.8); F=8.8, p = 0.004 accounting for variance between mice (Fig S3A);  Cohen’s d = 0.7; power = 0.80; no reliable effect of sex (F=1.9, p=0.17) or switching direction (F=0.1, p=0.75)).”

      And on Line 197:

      “GLM analysis also demonstrated that D2-MSNs had significantly different slopes (0.01 spikes/second (-0.10 – 0.10)), which were distinct from D1-MSNs (-0.20 (-0.45 – 0.06); Fig 3D; F=8.9, p = 0.004 accounting for variance between mice (Fig S3B); Cohen’s d = 0.8; power = 0.98).  We found that D2-MSNs and D1-MSNs had a significantly different slope even when excluding outliers (4 outliers excluded outside of 95% confidence intervals; F=7.51, p=0.008 accounting for variance between mice) and when the interval was defined as the time between trial start and the switch response on a trial-by-trial basis for each neuron (F=4.3, p=0.04 accounting for variance between mice).”

      These are medium-to-large Cohen’s d results, and we have adequate statistical power. These results are not easily explained by chance. 

      We also added boxplots, which highlight the differences in distribution.

      Finally, we note that our conclusions are drawn from many convergent analyses (on Line 216): 

      “Analyses of average activity, PC1, and trial-by-trial firing-rate slopes over the interval provide convergent evidence that D2-MSNs and D1-MSNs had distinct and opposing dynamics during interval timing.”

      In the optogenetic experiment, the laser was kept on for too long (18 seconds) at high power (12 mW). This has been shown to cause adverse effects on population activity (for example, through heating the tissue) that are not necessarily related to their function during the task epochs. 

      This is an important point. We are well aware of heating effects with optogenetics and other potential confounds. For the exact reasons noted by the reviewer, we had opsin-negative controls, in which the laser was on for the exact same amount of time (18 seconds) and at the same power (12 mW), in Figure S5. We have now better highlighted these controls in the methods (Line 598):

      “In animals injected with optogenetic viruses, optical inhibition was delivered via bilateral patch cables for the entire trial duration of 18 seconds via 589-nm laser light at 12 mW power on 50% of randomly assigned trials. We performed control experiments in mice without opsins using identical laser parameters in D2-cre or D1-cre mice (Fig S6).”

      And in results (Line 298):

      “Importantly, we found no reliable effects for D2-MSNs with opsin-negative controls (Fig S6).”

      And (Line 306): 

      “As with D2-MSNs, we found no reliable effects with opsin-negative controls in D1-MSNs (Fig S6).”

      We have highlighted these data in Figure S6: 

      Furthermore, the effect of optogenetic inhibition is similar to pharmacological effects in this manuscript and in our prior work (De Corte et al., 2019; Stutt et al., 2024) (Line 459): 

      “Past pharmacological work from our group and others has shown that disrupting D2- or D1-MSNs slows timing (De Corte et al., 2019b; Drew et al., 2007, 2003; Stutt et al., 2024), in line with pharmacological and optogenetic results in this manuscript.”

      And in the discussion section on Line 488: 

      “Our approach has several limitations. First, systemic drug injections block D2- and D1-receptors in many different brain regions, including the frontal cortex, which is involved in interval timing (Kim et al., 2017a). D2 blockade or D1 blockade may have complex effects, including corticostriatal or network effects that contribute to changes in D2-MSN or D1-MSN ensemble activity. We note that optogenetic inhibition of D2-MSNs and D1-MSNs produces similar effects to pharmacology in Figure 5.”

      Given the systemic delivery of pharmacological interventions, it is difficult to conclude that the effects are specific to the dorsomedial striatum. Future studies should use the local infusion of drugs into the dorsomedial striatum. 

      This is a great point - we did this experiment in De Corte et al, 2019 with local drug infusions. This earlier study was the departure point for this experiment. We now point this out in the introduction (Line 92): 

      “Past work has shown that disrupting either D2-dopamine receptors (D2) or D1-dopamine receptors (D1) powerfully impairs interval timing by increasing estimates of elapsed time (Drew et al., 2007; Meck, 2006). Similar behavioral effects were found with systemic (Stutt et al., 2024) or local dorsomedial striatal D2 or D1 disruption (De Corte et al., 2019a). These data lead to the hypothesis that D2 MSNs and D1 MSNs have similar patterns of ramping activity across a temporal interval.”

      However, the reviewer makes a great point - and we will develop this in our future work (Line 485): 

      “Future studies might extend our work combining local pharmacology with neuronal ensemble recording.”

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Just a few minor notes: 

      (1) Figures 2C and D should have error bars. 

      We agree.  We added error bars to these figures and other rasters as recommended.  

      (2) Figures 2G and H seem to be smoothed - how was this done? 

      We added these details.

      (3) It is unclear what the 'neural network machine learning classifier' mentioned in lines 193-199 adds if the data relevant to this analysis isn't presented. I would potentially include this. 

      We agree. This analysis was confusing and not relevant to our main points; consequently, we removed it.  

      Reviewer #2 (Recommendations For The Authors): 

      Major: 

      (1)  For Figure 2, the description of the main results in (C-F) in the main text is too brief and is not clear. 

      We have added to and clarified this text (Line 147):

      “Striatal neuronal populations are largely composed of MSNs expressing D2-dopamine or D1-dopamine receptors. We optogenetically tagged D2-MSNs and D1-MSNs by implanting optrodes in the dorsomedial striatum and conditionally expressing channelrhodopsin (ChR2; Fig S1) in 4 D2-Cre (2 female) and 5 D1-Cre transgenic mice (2 female). This approach expressed ChR2 in D2-MSNs or D1-MSNs, respectively (Fig 2A-B; Kim et al., 2017a). We identified D2-MSNs or D1-MSNs by their response to brief pulses of 473 nm light; neurons that fired within 5 milliseconds were considered optically tagged putative D2-MSNs (Fig S1B-C). We tagged 32 putative D2-MSNs and 41 putative D1-MSNs in a single recording session during interval timing. There were no consistent differences in overall firing rate between D2-MSNs and D1-MSNs (D2-MSNs: 3.4 (1.4 – 7.2) Hz; D1-MSNs 5.2 (3.1 – 8.6) Hz; F = 2.7, p = 0.11 accounting for variance between mice). Peri-event rasters and histograms from a tagged putative D2-MSN (Fig 2C) and from a tagged putative D1-MSN (Fig 2D) demonstrate prominent modulations for the first 6 seconds of the interval after trial start. Z-scores of average peri-event time histograms (PETHs) from 0 to 6 seconds after trial start for each putative D2-MSN are shown in Fig 2E and for each putative D1-MSN in Fig 2F. These PETHs revealed that for the 6-second interval immediately after trial start, many putative D2-MSN neurons appeared to ramp up while many putative D1-MSNs appeared to ramp down. For 32 putative D2-MSNs average PETH activity increased over the 6-second interval immediately after trial start, whereas for 41 putative D1-MSNs, average PETH activity decreased. These differences resulted in distinct activity early in the interval (0-1 seconds; F = 6.0, p = 0.02 accounting for variance between mice), but not late in the interval (5-6 seconds; F = 1.9, p = 0.17 accounting for variance between mice) between D2-MSNs and D1-MSNs. Examination of a longer interval of 10 seconds before to 18 seconds after trial start revealed the greatest separation in D2-MSN and D1-MSN dynamics during the 6-second interval after trial start (Fig S2). Strikingly, these data suggest that D2-MSNs and D1-MSNs might display opposite dynamics during interval timing.”
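
      The z-scored PETHs described in this passage can be sketched in Python as follows. This is only an illustration: the bin width, the choice to z-score across bins within the 0 to 6 second window, and the function name are assumptions not stated in the quoted text, and the spike times are made up.

```python
import statistics


def zscored_peth(trials, window=6.0, bin_width=0.5):
    """Trial-averaged firing rate per bin, z-scored across bins.

    trials: list of trials; each trial is a list of spike times in seconds
        relative to that trial's start.
    """
    n_bins = int(round(window / bin_width))
    counts = [0] * n_bins
    for spikes in trials:
        for t in spikes:
            if 0 <= t < window:
                # min() guards against floating-point edge cases at the last bin.
                counts[min(int(t / bin_width), n_bins - 1)] += 1

    # Convert pooled counts to a trial-averaged rate in spikes/second.
    rates = [c / (len(trials) * bin_width) for c in counts]

    mu = statistics.mean(rates)
    sd = statistics.pstdev(rates)
    if sd == 0:
        return [0.0] * n_bins  # flat PETH: no modulation to normalize
    return [(r - mu) / sd for r in rates]


# Hypothetical neuron whose firing increases across the interval.
trials = [
    [0.8, 2.6, 3.9, 4.4, 5.2, 5.8],
    [1.5, 3.1, 4.2, 4.9, 5.5, 5.9],
]
print(zscored_peth(trials, window=6.0, bin_width=1.0))
```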

      (2)  For Figure3 

      (A)  Is the PC1 calculated from all MSNs of all mice (4 D2, 5 D1 mice)? 

      We clarified this (Line 182):

      “We analyzed PCA calculated from all D2-MSNs and D1-MSNs PETHs over the 6-second interval immediately after trial start.”

      And for pharmacology (Line 362): 

      “We noticed differences in MSN activity across the interval with D2 blockade and D1 blockade at the individual MSN level (Fig 6B-D) as well as at the population level (Fig 6E). We used PCA to quantify effects of D2 blockade or D1 blockade (Bruce et al., 2021; Emmons et al., 2017; Kim et al., 2017a). We constructed principal components (PC) from z-scored peri-event time histograms of firing rate from saline, D2 blockade, and D1 blockade sessions for all mice together.”

      (B)  The authors should perform PCA on single mouse data, and add the plot and error bar. 

      This is a great idea. We have now included this as a new Figure S3:   

      (C)  As mentioned before, both D2-or D1- MSNs can be divided into three groups, it is not appropriate to put them together as each MSN is not an independent variable, the authors should do the statistics based on the individual mouse, and do the parametric or non-parametric comparison, and plot N (number of mice) based error bars. 

      We have done exactly this using a linear mixed effects model, as recommended by our statistics core. They have explicitly suggested that this is the best approach to these data (see letter). We have also included measures of statistical power and effect size (Line 704):  

      “All data and statistical approaches were reviewed by the Biostatistics, Epidemiology, and Research Design Core (BERD) at the Institute for Clinical and Translational Sciences (ICTS) at the University of Iowa. All code and data are made available at http://narayanan.lab.uiowa.edu/article/datasets. We used the median to measure central tendency and the interquartile range to measure spread. We used Wilcoxon nonparametric tests to compare behavior between experimental conditions and Cohen’s d to calculate effect size. Analyses of putative single-unit activity and basic physiological properties were carried out using custom routines for MATLAB.

      For all neuronal analyses, variability between animals was accounted for using generalized linear-mixed effects models and incorporating a random effect for each mouse into the model, which allows us to account for inherent between-mouse variability. We used fitglme in MATLAB and verified main effects using lmer in R. We accounted for variability between MSNs in pharmacological datasets in which we could match MSNs between saline, D2 blockade, and D1 blockade. P values < 0.05 were interpreted as significant.”

      We have now included measures of ‘power’ (which we interpret to be statistical), effect size, and perform additional sensitivity analyses (Line 187): 

      “PC1 scores for D1-MSNs (Fig 3C; PC1 for D2-MSNs: -3.4 (-4.6 – 2.5); PC1 for D1-MSNs: 2.8 (-4.9 – -2.8); F=8.8, p = 0.004 accounting for variance between mice (Fig S3A); Cohen’s d = 0.7; power = 0.80; no reliable effect of sex (F=1.9, p=0.17) or switching direction (F=0.1, p=0.75)).”

      And Line 197:

      “GLM analysis also demonstrated that D2-MSNs had significantly different slopes (0.01 spikes/second (-0.10 – 0.10)), which were distinct from D1-MSNs (-0.20 (-0.45 – 0.06); Fig 3D; F=8.9, p = 0.004 accounting for variance between mice (Fig S3B); Cohen’s d = 0.8; power = 0.98).  We found that D2-MSNs and D1-MSNs had a significantly different slope even when excluding outliers (4 outliers excluded outside of 95% confidence intervals; F=7.51, p=0.008 accounting for variance between mice) and when the interval was defined as the time between trial start and the switch response on a trial-by-trial basis for each neuron (F=4.3, p=0.04 accounting for variance between mice).”

      These are medium-to-large Cohen’s d results, and we have adequate statistical power. These results are not easily explained by chance. 

      We also added boxplots, which highlight the differences in distributions.

      (3) For results in Figure 5 and Figure S7, according to Figure 1 legend, lines 4 to 5, the response times were defined as the moment mice exit the first nose poke (on the left) to respond at the second nose poke; and according to the methods section (line 522), "switch" traversal time was defined as the duration between first nose poke exit and second nose poke entry. It seems that response time is the switch traversal time, so they should be the same, but in Figures B and D, the response time showed a clear difference between the laser off and on groups, while in Figures S7C and G, there were no differences between the laser off and on groups for switch traversal time. Please reconcile these inconsistencies. 

      We were not clear. We now clarify – switch responses are the moment when mice depart the first nosepoke, whereas traversal time is the time between departing the first nosepoke and arriving at the second nosepoke. We have reworked our figures to make this clear.

      And in the methods (Line 570):

      “Switch response time was defined as the moment animals departed the first nosepoke before arriving at the second nosepoke. Critically, switch responses are a time-based decision guided by temporal control of action because mice switch nosepokes only if nosepokes at the first location did not receive a reward after 6 seconds. That is, mice estimate if more than 6 seconds have elapsed without receiving a reward to decide to switch responses. Mice learn this task quickly (3-4 weeks), and error trials in which an animal nosepokes in the wrong order or does not nosepoke are relatively rare and discarded. Consequently, we focused on these switch response times as the key metric for temporal control of action. Traversal time was defined as the duration between first nosepoke exit and second nosepoke entry and is distinct from switch response time when animals departed the first nosepoke. Nosepoke duration was defined as the time between first nosepoke entry and exit for the switch response times only. Trials were self-initiated, but there was an intertrial interval with a geometric mean of 30 seconds between trials.”
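
      The distinction between the three behavioral measures defined above can be written out as a short Python sketch. The class and field names are hypothetical, and all times are assumed to be in seconds relative to trial start.

```python
from dataclasses import dataclass


@dataclass
class SwitchTrial:
    first_entry: float   # entry into the first nosepoke for the switch response
    first_exit: float    # departure from the first nosepoke
    second_entry: float  # arrival at the second nosepoke


def switch_metrics(trial):
    """Compute the three measures defined in the methods text."""
    return {
        # The moment the animal departs the first nosepoke.
        "switch_response_time": trial.first_exit,
        # Duration between first nosepoke exit and second nosepoke entry.
        "traversal_time": trial.second_entry - trial.first_exit,
        # Time between first nosepoke entry and exit.
        "nosepoke_duration": trial.first_exit - trial.first_entry,
    }


# Hypothetical trial: the mouse leaves the first nosepoke at 9.25 s.
print(switch_metrics(SwitchTrial(first_entry=7.5, first_exit=9.25, second_entry=10.0)))
```

      Under these definitions, a manipulation can shift the switch response time while leaving the traversal time unchanged, because the former is a time point and the latter is a duration.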

      And in Figure S8, we have added graphics and clarified the legend.

      (4) The first nose poke and second nose poke are very close, why did it take so long to move from the first nose poke to the second nose poke, even though the mouse already made the decision to switch? Please see Figure S1A, it took less than 6s from the back nose poke to the first nose poke, but it took more than 6s (up to 12s) from the first nose poke to the second nose poke, what were the mice's behavior during this period? 

      This is a key detail. There is no temporal urgency as only the initial nosepoke after 18 seconds leads to reward. In other words, making a second nosepoke prior to 18 seconds is not rewarded and, in well-trained animals, is wasted effort. We have added these details to the methods (Line 124):

      “On the remaining 50% of trials, mice were rewarded for nosepoking at the ‘first’ nosepoke and then switching to the ‘second’ nosepoke; initial nosepokes at the second nosepoke after 18 seconds triggered reward when preceded by a first nosepoke. The first nosepokes occurred before switching responses and the second nosepokes occurred much later in the interval in anticipation of reward delivery at 18 seconds (Fig 1B-D). During the task, movement velocity peaked before 6 seconds as mice traveled to the front nosepoke (Fig 1E).”

      And in Figure 1, as described in detail above. 

      (5) How many trials did mice perform in one day? How many recordings/day for how many days were performed? 

      These are key details that we have now added to Table 1.

      We have added the number of recording sessions to the methods (Line 603): 

      “For optogenetic tagging, putative D1- and D2-MSNs were optically identified via 473-nm photostimulation. Units with mean post-stimulation spike latencies of ≤5 milliseconds and a stimulated-to-unstimulated waveform correlation ratio of >0.9 were classified as putative D2-MSNs or D1-MSNs (Ryan et al., 2018; Shin et al., 2018). Only one recording session was performed for each animal per day, and one recording session was included from each animal.”

      And Line 606: 

      “Only one recording session was performed for each animal per day, and one recording session was included from saline, D2 blockade, and D1 blockade sessions.”

      (6) For results in Figure 5, the authors should analyze the speed for the laser on and off group, since the dorsomedial striatum was reported to be related to control of speed (Yttri, Eric A., and Joshua T. Dudman. "Opponent and bidirectional control of movement velocity in the basal ganglia." Nature 533.7603 (2016): 402-406.). 

      We have some initial DeepLabCut data and have included it in a new Figure 1E.

      B) DeepLabCut tracking of position during the interval timing revealed that mice moved quickly after trial start and then velocity was relatively constant throughout the trial

      We measure movement speed using nosepoke duration and traversal time, which can give some measure of movement velocity.

      In Yttri and Dudman, the mice are head-fixed and moving a joystick, whereas our mice are freely moving. However, we have now included the lack of motor control as a major limitation (Line 510): 

      “Finally, movement and motivation contribute to MSN dynamics (Robbe, 2023). Four lines of evidence argue that our findings cannot be directly explained by motor confounds: 1) D2-MSNs and D1-MSNs diverge between 0-6 seconds after trial start well before the first nosepoke (Fig S2), 2) our GLM accounted for nosepokes and nosepoke-related βs were similar between D2-MSNs and D1-MSNs, 3) optogenetic disruption of dorsomedial D2-MSNs and D1-MSNs did not change task-specific movements despite reliable changes in switch response time, and 4) ramping dynamics were quite distinct from movement dynamics. Furthermore, disrupting D2-MSNs and D1-MSNs did not change the number of rewards animals received, implying that these disruptions did not grossly affect motivation. Still, future work combining motion tracking with neuronal ensemble recording and optogenetics and including bisection tasks may further unravel timing vs. movement in MSN dynamics (Robbe, 2023).”

      (7)  Figure S3 (C, E, and F), statistics should be done based on N (number of mice), not on the number of recorded neurons.  

      We have removed this section, and all other statistics in the paper properly account for mouse-specific variance, as noted above.

      (8)  Figure S1 

      (A) Are these the results from all mice superposed together, or from one mouse on one given day? How many of the trials' data were superposed?

      We included these details in a new Figure 1.

      (B, C) How many trials were included? 

      (D) How many days did these data cover? 

      We have included a new Table 1 with these important details.

      We have noted that only 1 recording session / mouse was included in analysis (Line 606):

      “Only one recording session was performed for each animal per day, and one recording session was included from each animal.”

      And Line 614: 

      “Only one recording session was performed for each animal per day, and one recording session was included from saline, D2 blockade, and D1 blockade sessions.”

      (9) Figure S2 

      (A) Can the authors add coordinates of the brain according to the mouse brain atlas or, alternatively, show it using a coronal section? 

      Great idea – added to Figure S2 legend: 

      “Figure S1: A) Recording locations in the dorsomedial striatum (targeting AP +0.4, ML -1.4, DV -2.7). Electrode reconstructions for D2-Cre (red), D1-Cre (blue), and wild-type mice (green). Only the left striatum was implanted with electrodes in all animals.”

      We have also added it to Figure S5 legend: 

      “Figure S5: Fiber optic locations from A) an opsin-expressing mouse with mCherry-tagged halorhodopsin and bilateral fiber optics, and B) across 10 D2-Cre mice (red) and 6 D1-Cre mice (blue) with fiber optics (targeting AP +0.9, ML +/-1.3, DV –2.5).”

      (C) Why did the waveform of laser and no laser seem the same? 

      The optogenetically tagged spike waveforms are highly similar, indicating that optogenetically-triggered spikes are like other spikes. That is the main point – optogenetically stimulating the neuron does not change the waveform. We have added this detail to the legend of S1: 

      “Inset on bottom right – waveforms from laser trials (red) and trials without laser (blue).  Across 73 tagged neurons, waveform correlation coefficients for laser trials vs. trials without laser was r = 0.97 (0.92-0.99). These data demonstrate that optogenetically triggered spikes are similar to non-optogenetically triggered spikes.”
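
      The tagging criteria quoted earlier (mean post-stimulation latency of ≤5 milliseconds and a laser vs. no-laser waveform correlation of >0.9) can be sketched as follows. This is an illustration under assumptions, not the authors' pipeline: the function names are hypothetical, and the "waveform correlation" is assumed here to be a Pearson correlation between mean waveforms.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length waveforms."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # undefined (division by zero) for flat waveforms


def is_putatively_tagged(latencies_ms, laser_waveform, no_laser_waveform,
                         max_latency_ms=5.0, min_r=0.9):
    """Apply both criteria: short spike latency and similar waveform shape."""
    short_latency = mean(latencies_ms) <= max_latency_ms
    similar_shape = pearson_r(laser_waveform, no_laser_waveform) > min_r
    return short_latency and similar_shape
```

      The waveform criterion guards against counting light artifacts or a different nearby unit as the tagged neuron, which is why a high laser vs. no-laser correlation is the expected outcome rather than a cause for concern.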

      (10)  Figure S7, what was the laser power used in this experiment? Have the authors tried different laser powers? 

      We have now clarified the laser power on line 598: 

      “In animals injected with optogenetic viruses, optical inhibition was delivered via bilateral patch cables for the entire trial duration of 18 seconds via 589-nm laser light at 12 mW power on 50% of randomly assigned trials.”

      And for Figure S6 (was S7 previously): 

      We did not try other laser powers; our parameters were chosen a priori based on our past work.  

      (11)  In Figure S9, what method was used to sort the neurons? 

      We now clarify in the methods (Line 617): 

      “Electrophysiology. Single-unit recordings were made using a multi-electrode recording system (Open Ephys, Atlanta, GA). After the experiments, Plexon Offline Sorter (Plexon, Dallas, TX) was used to remove artifacts. Principal component analysis (PCA) and waveform shape were used for spike sorting. Single units were defined as those 1) having a consistent waveform shape, 2) being a separable cluster in PCA space, and 3) having a consistent refractory period of at least 2 milliseconds in interspike interval histograms. The same MSNs were sorted across saline, D2 blockade, and D1 blockade sessions by loading all sessions simultaneously in Offline Sorter and sorting them using the preceding criteria. MSNs had to have consistent firing in all sessions to be included. Sorting integrity across sessions was quantified by comparing waveform similarity via R² between sessions.”

      And in the results (Line 353):

      “We analyzed 99 MSNs in sessions with saline, D2 blockade, and D1 blockade. We matched MSNs across sessions based on waveform and interspike intervals; waveforms were highly similar across sessions (correlation coefficient between matched MSN waveforms: saline vs D2 blockade r = 1.00 (0.99 – 1.00), rank sum vs correlations in unmatched waveforms p = 3x10^-44; saline vs D1 blockade r = 1.00 (1.00 – 1.00), rank sum vs correlations in unmatched waveforms p = 4x10^-50). There were no consistent changes in MSN average firing rate with D2 blockade or D1 blockade (F = 1.1, p = 0.30 accounting for variance between MSNs; saline: 5.2 (3.3 – 8.6) Hz; D2 blockade 5.1 (2.7 – 8.0) Hz; F = 2.2, p = 0.14; D1 blockade 4.9 (2.4 – 7.8) Hz).”

      (C-F) statistics should be done based on the number of mice, not on the number of recorded neurons. 

      We agree. All experiments are now quantified using linear mixed effects models, which formally account for variance contributed across animals, as discussed at length earlier in the review and with statistical experts at the University of Iowa.
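
      To make concrete what "accounting for variance between mice" can mean, one common form is a random-intercept model for neuron i recorded in mouse j. The symbols below, and the choice of a random intercept alone, are assumptions for illustration; the exact model specification is not given in this response.

```latex
y_{ij} = \beta_0 + \beta_1 x_{ij} + u_j + \varepsilon_{ij},
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2),
\quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

      Here y_{ij} is the measured quantity for a neuron (for example, its ramping slope), x_{ij} codes the fixed effect of interest (for example, D2-MSN vs. D1-MSN), and u_j is a mouse-specific offset shared by all neurons from that animal. Because neurons from the same mouse share u_j, they are not treated as independent samples, which is the concern behind the request to base statistics on the number of mice.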

      (12) For results in Figure 6, did the authors do cell-type specific recording on D1 or D2 MSNs using optogenetic tagging? As the D1- or D2- MSNs account for ~50% of all MSNs, the inhibition of a considerable amount of neurons was not observed. The authors should discuss the relation between the results from optogenetic inhibition of D1- or D2- MSNs and pharmacological disruption of D1 or D2 dopamine receptors. 

      This is a great point. First, we did not combine cell-type specific recordings with tagging as it was difficult to get enough trials for analysis in a single session in the tagging experiments, and pharmacological interventions can further decrease performance.  However, we have made our results in Figure 6 much more focused.

      We have discussed the relationship between these data in the results (Line 380): 

      “This data-driven analysis shows that D2 and D1 blockade produced similar shifts in MSN population dynamics represented by PC1.  When combined with major contributions of D1/D2 MSNs to PC1 (Fig 3C) these findings show that pharmacologically disrupting D2 or D1 MSNs can disrupt ramping-related activity in the striatum.”

      And in the discussion (Line 417): 

      “Strikingly, optogenetic tagging showed that D2-MSNs and D1-MSNs had distinct dynamics during interval timing. MSN dynamics helped construct and constrain a four-parameter drift-diffusion model in which D2- and D1-MSN spiking accumulated temporal evidence. This model predicted that disrupting either D2-MSNs or D1-MSNs would increase response times. Accordingly, we found that optogenetically or pharmacologically disrupting striatal D2-MSNs or D1-MSNs increased response times without affecting task-specific movements. Disrupting D2-MSNs or D1-MSNs shifted MSN temporal dynamics and degraded MSN temporal encoding. These data, when combined with our model predictions, demonstrate that D2-MSNs and D1-MSNs contribute temporal evidence to controlling actions in time.”

      And: 

      “D2-MSNs and D1-MSNs play complementary roles in movement. For instance, stimulating D1-MSNs facilitates movement, whereas stimulating D2-MSNs impairs movement (Kravitz et al., 2010). Both populations have been shown to have complementary patterns of activity during movements (Tecuapetla et al., 2016), with MSNs firing at different phases of action initiation and selection. Further dissection of action selection programs reveals that opposing patterns of activation among D2-MSNs and D1-MSNs suppress and guide actions, respectively, in the dorsolateral striatum (Cruz et al., 2022). A particular advantage of interval timing is that it captures a cognitive behavior within a single dimension: time. When projected along the temporal dimension, it was surprising that D2-MSNs and D1-MSNs had opposing patterns of activity. Past pharmacological work from our group and others has shown that disrupting D2 or D1 MSNs slows timing (De Corte et al., 2019; Drew et al., 2007, 2003; Stutt et al., 2023), in line with pharmacological and optogenetic results in this manuscript. Computational modeling predicted that disrupting either D2-MSNs or D1-MSNs increased self-reported estimates of time, which was supported by both optogenetic and pharmacological experiments. Notably, these disruptions are distinct from increased timing variability reported with administrations of amphetamine, ventral tegmental area dopamine neuron lesions, and rodent models of neurodegenerative disease (Balci et al., 2008; Gür et al., 2020, 2019; Larson et al., 2022; Weber et al., 2023). Furthermore, our current data demonstrate that disrupting either D2-MSN or D1-MSN activity shifted MSN dynamics and degraded temporal encoding, supporting prior work (De Corte et al., 2019; Drew et al., 2007, 2003; Stutt et al., 2023).
Our recording experiments do not identify where a possible response threshold T is instantiated, but downstream basal ganglia structures may have a key role in setting response thresholds (Toda et al., 2017).”

      (13) For Figure 2, what is the error region for G and H? Is there a statistically significant difference between the start (e.g., 0-1 s) and the end (e.g., 5-6 s) time? 

      G and H are standard error, which we have now clarified.

      And on Line 166: 

      “These differences resulted in distinct activity early in the interval (0-1 seconds; F = 6.0, p = 0.02 accounting for variance between mice), but not late in the interval (5-6 seconds; F = 1.9, p = 0.17 accounting for variance between mice) between D2-MSNs and D1-MSNs.”

      Minor: 

      (1)  Figure 2 legend showed the wrong label: "Peri-event raster C) from a D2-MSN (red) and E) from a D1-MSN (blue)". It should be (D).

      Fixed, thank you.  

      (2)  Figure 2. Missing legend for (E) and (F).  

      Fixed, thank you.  

      (3)  Line 423: mistyped "\" 

      Fixed, thank you.  

      Reviewer #3 (Recommendations For The Authors): 

      -  To clarify that complementary means opposing in this context, I suggest changing the title. 

      This is a helpful suggestion. We have changed it exactly as the reviewer suggested: 

      “Complementary opposing D2-MSNs and D1-MSNs dynamics during interval timing”

      -  I recommend adding a supplementary figure to demonstrate all the nose pokes in all trials in a given session. The current figures make it hard to assess the specifics of the behavior. For example, what happens if, in a long-interval trial, the mouse pokes in the second nose poke before 6 seconds? Is that behavior punished? Do they keep alternating between the nose poke or do they stick to one nose poke? 

      We agree. We think this is a main point, and we have now redesigned Figure 1 to describe these details: 

      And added these details to the methods (Line 548): 

      “Interval timing switch task. We used a mouse-optimized operant interval timing task described in detail previously (Balci et al., 2008; Bruce et al., 2021; Tosun et al., 2016; Weber et al., 2023). Briefly, mice were trained in sound-attenuating operant chambers, with two front nosepokes flanking either side of a food hopper on the front wall, and a third nosepoke located at the center of the back wall. The chamber was positioned below an 8-kHz, 72-dB speaker (Fig 1A; MedAssociates, St. Albans, VT). Mice were 85% food restricted and motivated with 20 mg sucrose pellets (BioServ, Flemington, NJ). Mice were initially trained to receive rewards during fixed ratio nosepoke response trials. Nosepoke entry and exit were captured by infrared beams. After shaping, mice were trained in the “switch” interval timing task. Mice self-initiated trials at the back nosepoke, after which tone and nosepoke lights were illuminated simultaneously. Cues were identical on all trial types and lasted the entire duration of the trial (6 or 18 seconds). On 50% of trials, mice were rewarded for a nosepoke after 6 seconds at the designated first ‘front’ nosepoke; these trials were not analyzed. On the remaining 50% of trials, mice were rewarded for nosepoking first at the ‘first’ nosepoke location and then switching to the ‘second’ nosepoke location; the reward was delivered for initial nosepokes at the second nosepoke location after 18 seconds when preceded by a nosepoke at the first nosepoke location. Multiple nosepokes at each nosepoke were allowed. Early responses at the first or second nosepoke were not reinforced. Initial responses at the second nosepoke rather than the first nosepoke, alternating between nosepokes, and going back to the first nosepoke after the second nosepoke were rare after initial training. Error trials included trials where animals responded only at the first or second nosepoke and were also not reinforced. 
We did not analyze error trials as they were often too few to analyze; these were analyzed at length in our prior work (Bruce et al., 2021).”
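
      The reward contingencies described in this methods passage can be sketched as a small rule. This is one reading of the quoted text, not the task-control code: the function name, the trial-type labels "short" and "long", and the treatment of "after 6/18 seconds" as "at or after" are assumptions made for illustration.

```python
def is_rewarded(trial_type, pokes, short_s=6.0, long_s=18.0):
    """Decide whether a trial earns a reward.

    pokes: list of (port, entry_time_s) tuples, with port "first" or "second".
    """
    ordered = sorted(pokes, key=lambda p: p[1])
    if trial_type == "short":
        # Rewarded for a first-port nosepoke once the short interval has elapsed.
        return any(port == "first" and t >= short_s for port, t in ordered)
    if trial_type == "long":
        for port, t in ordered:
            if port == "second" and t >= long_s:
                # Initial second-port poke after the long interval; it only
                # counts when some first-port poke came before it.
                return any(p == "first" and s < t for p, s in ordered)
        return False
    raise ValueError("trial_type must be 'short' or 'long'")
```

      Under this rule an early second-port poke is neither rewarded nor punished, which matches the statement that early responses were simply not reinforced.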

      -  Figures 2E and 2F suggest that some D1 cells ramp up during the first 6 seconds, while others ramp down. The same is more or less true for D2s. I wonder if the analysis will lose its significance if the two outlier D1s are excluded from Figure 3D. 

      This is a great idea suggested by multiple reviewers. We repeated this analysis with outliers removed. We used a data-driven approach to remove outliers (Line 656): 

      “We performed additional sensitivity analysis excluding outliers outside of 95% confidence intervals and measuring firing rate from the start of the interval to the time of the switch response on a trial-by-trial level for each neuron.”

      And described these data in the results (Line 201): 

      “We found that D2-MSNs and D1-MSNs had a significantly different slope even when excluding outliers (4 outliers excluded outside of 95% confidence intervals; F=7.51, p=0.008 accounting for variance between mice) and when the interval was defined as the time between trial start and the switch response on a trial-by-trial basis for each neuron (F=4.3, p=0.04 accounting for variance between mice).”

      Finally, we removed the outliers the reviewers alluded to – two D1 MSNs – and found similar results (F=6.59, p=0.01 for main effect of D2 vs. D1 MSNs controlling for between-mouse variability). We elected to include the more data driven approach based on 95% confidence intervals.
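
      A data-driven exclusion of the kind described above can be sketched as follows. The response does not state how the 95% interval was computed; this sketch assumes mean ± 1.96 sample standard deviations, and the function name is hypothetical.

```python
from statistics import mean, stdev


def exclude_outliers(values, z=1.96):
    """Split values into (kept, removed) using mean +/- z * sample SD."""
    center, spread = mean(values), stdev(values)
    low, high = center - z * spread, center + z * spread
    kept = [v for v in values if low <= v <= high]
    removed = [v for v in values if not (low <= v <= high)]
    return kept, removed
```

      Because the bounds are computed from the data rather than chosen by eye, the same rule applies to both cell types, which is the advantage over hand-picking the two D1-MSNs the reviewer pointed to.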

    1. eLife assessment

      This valuable study introduces SCellBOW, a novel tool leveraging natural language processing techniques to enhance cell clustering and infer survival risks from single-cell RNA sequencing data. The methodology and results are convincing, demonstrating superior clustering performance and the ability to assign risk scores to cancer cell clusters across multiple datasets. SCellBOW's unique approach promises significant advancements in understanding cancer cell heterogeneity and identifying aggressive cancer cell subgroups.

    2. Reviewer #1 (Public Review):

      Summary:

      This review evaluates the SCellBOW framework, which applies phenotype algebra to obtain vectors from cancer subclusters or user-defined subclusters.

      Strengths:

      SCellBOW employs an innovative application of NLP-inspired techniques to analyze scRNA-seq data, facilitating the identification and visualization of phenotypically divergent cell subpopulations.

      The framework demonstrates robustness in accurately representing various cell types across multiple datasets, highlighting its versatility and utility in different biological contexts.

      By simulating the impact of specific malignant subpopulations on disease prognosis, SCellBOW provides valuable insights into the relative risk and aggressiveness of cancer subpopulations, which is crucial for personalized therapeutic strategies.

      The identification of a previously unknown and aggressive AR−/NElow subpopulation in metastatic prostate cancer underscores the potential of SCellBOW in uncovering clinically significant findings.

      Weaknesses:

      The reliance on bulk RNA-seq data as a reference raises concerns about potentially misleading results due to the presence of RNA expression from immune cells in the TME. It is unclear if SCellBOW adequately addresses this issue, which could affect the accuracy of the cancer subcluster vectors.

      The method of extracting vectors in phenotype algebra appears to be a straightforward subtraction operation. This simplicity might limit its efficiency in excluding associations with phenotypes from specific subpopulations, potentially leading to inaccurate interpretations of the data.

      The review would benefit from additional validation studies to assess the effectiveness of SCellBOW in distinguishing between cancerous and non-cancerous signals, particularly in heterogeneous tumor environments.

      Further clarification on how SCellBOW handles mixed-cell populations within bulk RNA-seq data would strengthen the evaluation of its applicability and reliability in diverse research settings.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors developed a novel tool, SCellBOW, to perform cell clustering and infer survival risks on individual cancer cell clusters from the single-cell RNA seq dataset. The key ideas/techniques used in the tool include transfer learning, bag of words (BOW), and phenotype algebra which is similar to word algebra from natural language processing (NLP). Comparisons with existing methods demonstrated that SCellBOW provides superior clustering results and exhibits robust performance across a wide range of datasets. Importantly, a distinguishing feature of SCellBOW compared to other tools is its ability to assign risk scores to specific cancer cell clusters. Using SCellBOW, the authors identified a new group of prostate cancer cells characterized by a highly aggressive and dedifferentiated phenotype.

      Strengths:

      The application of natural language processing (NLP) to single-cell RNA sequencing (scRNA-seq) datasets is both smart and insightful. Encoding gene expression levels as word frequencies is a creative way to apply text analysis techniques to biological data. When combined with transfer learning, this approach enhances our ability to describe the heterogeneity of different cells, offering a novel method for understanding the biological behavior of individual cells and surpassing the capabilities of existing cell clustering methods. Moreover, the ability of the package to predict risk, particularly within cancer datasets, significantly expands the potential applications.

      Weaknesses:

      Given the promising nature of this tool, it would be beneficial for the authors to test the risk-stratification functionality on other types of tumors with high heterogeneity, such as liver and pancreatic cancers, which currently lack clinically relevant and well-recognized stratification methods. Additionally, it would be worthwhile to investigate how the tool could be applied to spatial transcriptomics by analyzing cell embeddings from different layers within these tissues.

    4. Author response:

      Reviewer #1:

      This review evaluates the SCellBOW framework, which applies phenotype algebra to obtain vectors from cancer subclusters or user-defined subclusters.

      Strengths:

      SCellBOW employs an innovative application of NLP-inspired techniques to analyze scRNA-seq data, facilitating the identification and visualization of phenotypically divergent cell subpopulations. The framework demonstrates robustness in accurately representing various cell types across multiple datasets, highlighting its versatility and utility in different biological contexts. By simulating the impact of specific malignant subpopulations on disease prognosis, SCellBOW provides valuable insights into the relative risk and aggressiveness of cancer subpopulations, which is crucial for personalized therapeutic strategies. The identification of a previously unknown and aggressive AR−/NElow subpopulation in metastatic prostate cancer underscores the potential of SCellBOW in uncovering clinically significant findings.

      Major concerns:

      The reliance on bulk RNA-seq data as a reference raises concerns about potentially misleading results due to the presence of RNA expression from immune cells in the TME. It is unclear if SCellBOW adequately addresses this issue, which could affect the accuracy of the cancer subcluster vectors.

      To address the concern about potentially misleading results due to the TME when using bulk RNA-seq data as a reference:

      a. We account for systematic biases between the single-cell and bulk transcriptomics readouts by creating pseudo-bulk profiles for single-cell clusters, enabling more accurate comparisons.

      b. We encode expressions into word vectors and co-embed them together. By doing this, we mitigate any possibility of systematic differences in the embedding.

      c. It is imperative that we subject both single-cell and bulk data through the same treatments because otherwise, it will be difficult to perform algebraic operations on them.

      d. We rely on tumor bulk transcriptomics data from TCGA due to its high sample size and patient meta-data such as information pertaining to patient survival.

      We will discuss this in the revised manuscript.
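
      Point (a) above, creating pseudo-bulk profiles for single-cell clusters, can be illustrated with a minimal sketch. It assumes that a pseudo-bulk profile is the per-gene sum of counts over all cells in a cluster; whether SCellBOW sums, averages, or normalizes differently is not stated here. The function name and the gene symbols in the usage are illustrative only.

```python
from collections import defaultdict


def pseudo_bulk(cell_counts, cluster_labels):
    """Aggregate per-cell gene counts into one profile per cluster.

    cell_counts: list of dicts mapping gene -> count, one dict per cell.
    cluster_labels: cluster label for each cell, in the same order.
    """
    profiles = defaultdict(lambda: defaultdict(int))
    for counts, label in zip(cell_counts, cluster_labels):
        for gene, n in counts.items():
            profiles[label][gene] += n  # sum counts across cells in the cluster
    return {label: dict(genes) for label, genes in profiles.items()}
```

      For example, `pseudo_bulk([{"AR": 3, "KLK3": 1}, {"AR": 2}], ["c0", "c0"])` collapses two cells into a single cluster-level profile that can then be processed like a bulk sample.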

      The method of extracting vectors in phenotype algebra appears to be a straightforward subtraction operation. This simplicity might limit its efficiency in excluding associations with phenotypes from specific subpopulations, potentially leading to inaccurate interpretations of the data.

      Vector algebra operations are not done in the gene expression space (i.e., gene expression vectors associated with tumor samples), rather we process the single cell and bulk expression profiles through multiple steps (pseudo-bulk vector generation for single cell clusters, mapping gene expression values to word frequencies as better understood by the Doc2vec neural networks etc.) to ensure their embeddings are consistent and capture intricate phenotypic information. We have demonstrated this through rigorous validation of the clusters yielded on various types of healthy and diseased samples. Furthermore, we have demonstrated the consistency of the vector algebra operations on known cancer subtypes in breast cancer, glioblastoma, and prostate cancer.

      We will discuss this in the revised manuscript.
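
      The two steps named in this response, mapping expression values to word frequencies and then doing algebra on embeddings rather than on raw expression, can be sketched as below. This is a simplified illustration under assumptions: the rounding-based mapping and both function names are hypothetical, and the embedding step (a Doc2vec model in SCellBOW) is not reproduced, so the vectors passed to `subtract` stand in for learned embeddings.

```python
def expression_to_document(expression, scale=1.0):
    """Turn a gene -> expression mapping into a bag of words.

    Each gene name is repeated in proportion to its (scaled, rounded)
    expression, so more highly expressed genes are more frequent "words".
    """
    document = []
    for gene, value in sorted(expression.items()):
        document.extend([gene] * int(round(value * scale)))
    return document


def subtract(total_vector, cluster_vector):
    """Element-wise difference of two embedding vectors of equal length."""
    return [t - c for t, c in zip(total_vector, cluster_vector)]
```

      The point of the sketch is the order of operations: the subtraction is applied to embedding vectors produced after the document encoding, not to the gene expression vectors themselves.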

      The review would benefit from additional validation studies to assess the effectiveness of SCellBOW in distinguishing between cancerous and non-cancerous signals, particularly in heterogeneous tumor environments.

      In our study, we are primarily interested in signals from malignant cells. However, we may consider scRNA-seq data with stromal cells and test whether SCellBOW can identify the influence of different stromal cell types on cancer aggressiveness.

      Further clarification on how SCellBOW handles mixed-cell populations within bulk RNA-seq data would strengthen the evaluation of its applicability and reliability in diverse research settings.

      We will elaborate on our discussion in the Results as well as Discussion sections.

      Reviewer #2:

      The authors developed a novel tool, SCellBOW, to perform cell clustering and infer survival risks on individual cancer cell clusters from the single-cell RNA seq dataset. The key ideas/techniques used in the tool include transfer learning, bag of words (BOW), and phenotype algebra which is similar to word algebra from natural language processing (NLP). Comparisons with existing methods demonstrated that SCellBOW provides superior clustering results and exhibits robust performance across a wide range of datasets. Importantly, a distinguishing feature of SCellBOW compared to other tools is its ability to assign risk scores to specific cancer cell clusters. Using SCellBOW, the authors identified a new group of prostate cancer cells characterized by a highly aggressive and dedifferentiated phenotype.

      Strengths:

      The application of natural language processing (NLP) to single-cell RNA sequencing (scRNA-seq) datasets is both smart and insightful. Encoding gene expression levels as word frequencies is a creative way to apply text analysis techniques to biological data. When combined with transfer learning, this approach enhances our ability to describe the heterogeneity of different cells, offering a novel method for understanding the biological behavior of individual cells and surpassing the capabilities of existing cell clustering methods. Moreover, the ability of the package to predict risk, particularly within cancer datasets, significantly expands the potential applications.

      Major concerns:

      Given the promising nature of this tool, it would be beneficial for the authors to test the risk-stratification functionality on other types of tumors with high heterogeneity, such as liver and pancreatic cancers, which currently lack clinically relevant and well-recognized stratification methods. Additionally, it would be worthwhile to investigate how the tool could be applied to spatial transcriptomics by analyzing cell embeddings from different layers within these tissues.

      (1) Our selection of glioblastoma and breast cancer for this study was primarily driven by the focus on extensively studied and well-defined cancer types. To demonstrate the effectiveness of our model, we tested it on advanced prostate cancer, which currently lacks clinically relevant and well-recognized stratification methods. This application to metastatic prostate cancer serves as a proof of concept, illustrating our model's potential to provide valuable insights into cancer types where established stratification approaches are limited or absent. However, as suggested by the Reviewer, we will try to incorporate results for liver cancer, subject to the availability of adequate data for model building.

      (2) Regarding the application of our tool to spatial transcriptomics, we have already analyzed data from Digital Spatial Profiling (DSP). The article is already quite complex and involved, and we are afraid the inclusion of spatial transcriptomics may amount to a significant extension of the method. To this end, although we will discuss the future possibilities, we will skip the method validity check on spatial transcriptomics data.

    1. eLife assessment:

      This useful study shows that the essential Acinetobacter baumannii gene Aeg1 likely plays a key role in cell division. The strength of the work is the discovery that the depletion of Aeg1 leads to cell filamentation and that gain-of-function mutations in cell division genes FtsB and FtsL rescue the lethality of Aeg1 depletion. However, Aeg1's localization pattern and its requirement for other division proteins' localizations require further characterization of the functionality of fluorescent fusion proteins, fluorescence images of higher quality, and improvements in statistical quantification, leaving the study's evidence for Aeg1's exact role in cell division incomplete at this time. In conclusion, the critical role of Aeg1 in the assembly of the A. baumannii divisome has yet to be established unambiguously.

    2. In this study, the authors confirm that one of the genes classified as essential in a Tn-mutagenesis study in A. baumannii, Aeg1, is, in fact, an essential gene. The strength of the work is that it discovered that the depletion of Aeg1 leads to cell filamentation and that activation mutations in various cell division genes can suppress the requirement for Aeg1. These results suggest that Aeg1 plays an important role in cell division. The work's weakness is that it lacks convincing evidence to define Aeg1's place or role in the divisome assembly pathway. It is unclear whether proteins are at the division site under the wildtype condition and when Aeg1 is depleted, and whether Aeg1 is indeed required to recruit a set of division proteins to the division site.

      Reviewer comments:

      The revised manuscript partially addressed two of the three major concerns from the previous assessment: (1) the functionality test of fluorescent fusion proteins using a spotting assay, and (2) membrane protein topology in the bacterial two-hybrid assays by constructing a C-terminal T25 fusion.

      (1) In the spotting assay, all fluorescent fusion proteins rescued the growth of the corresponding deletion strain, which suggests these fusion proteins are functional. However, fluorescent images of these fusion proteins were diffuse, and only a few cells showed the expected midcell/membrane localization pattern for cell division proteins. This observation raised the concern that these fusion proteins may be cleaved in the middle, leading to the separation of the untagged fusion partner and diffuse fluorescent protein in the cytoplasm, which would explain the positive spotting rescue results. This phenomenon is commonly observed in other bacterial species. A western blot using an antibody targeting either the fluorescent protein or the fusion partner is widely used to examine whether the fusion protein is expressed at its full length.

      (2) The authors constructed a C-terminal fusion of Aeg1 and showed that it still interacted with ZipA and FtsN. This result supports the authors' suggestion that the N-terminus of Aeg1 may not be the predicted membrane-targeting domain. Along the same lines, the membrane topology of ZipA should also be considered. ZipA's N terminus is in the membrane facing the periplasm, and its C terminal domain is in the cytoplasm. Therefore, the pUT18C fusion will place the T18 domain of ZipA in the periplasm. All other division proteins' N termini are in the cytoplasm.

      (3) Colocalization images did not show significant midcell localizations for each fluorescent protein; most cells showed diffuse cytoplasmic fluorescence. In all other species, midcell localization of cell division proteins is prominent in dividing cells, especially for early division proteins such as ZipA (at least 40-50% of cells show midcell bands). In A. baumannii, divisome localization timing may differ from other species, but this possibility needs to be established before the colocalization pattern is examined. Compounding this issue is that in Aeg1 depletion strains, some cells expressing ZipA, FtsB, FtsL, and FtsN fusions showed roughly regularly spaced puncta in long filamentous cells. It is hard to explain why this was observed if, under the WT condition, these fusions do not localize to the midcell. These results again raised concerns that these fusion proteins may not be functional and the observations are protein aggregates.

      Besides these major issues, experimental observations did not support some claims in the main text. For example: (1) In the two-hybrid assay, only ZipA and FtsN showed significant interactions with Aeg1, as judged by the darkness of the blue spots. FtsL and FtsB showed pale spots. The quantified values accompanying this figure did not appear to agree with the image. (2) The spotting rescue assay showed that only FtsB-E56A and FtsA-E202K were able to bypass Aeg1 depletion (full dilution set comparable to that of Aeg1 complementation), but the main text claimed that FtsA-D124A and V144L, and FtsW-M254I and S274G also rescued the growth. These claims could be misleading.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      (1) The reviewers asked to clarify the BTH assay: The fused T25 and T18 domains must be in the cytoplasm to complement successfully. The authors stated that the N terminus of Aeg1 traverses the membrane once, which means that the T25-Aeg1 will have T25 in the periplasm. However, T18C vector fusion with other division proteins will have T18C of ZipA in the periplasm (ZipA's N terminus is on the periplasmic side of the inner membrane) while that of FtsN in the cytoplasm (FtsN's N terminus is in the cytoplasm). As such, it isn't easy to understand why T25-Aeg1 showed positive results for both ZipA and FtsN. Note that FtsL, FtsB, and FtsI all have the same topology as FtsN but showed negative results. It is possible that these fusion proteins do not fold correctly, and hence, the results cannot be interpreted directly. The authors did not address this concern but only cited that BTH is a commonly used assay for protein-protein interactions.

      In response to the editor's comments and the concerns raised by the reviewer, we have performed two sets of Aeg1-T25 fusion experiments to determine whether the Aeg1 topology impacts protein interactions measured by bacterial two-hybrid (BTH) assays. In the first set of experiments, we fused the T25 domain to the N-terminus of Aeg1 and still observed strong binding of Aeg1 to ZipA and FtsN, respectively. Similar results were obtained from the second set of experiments in which the T25 domain was fused to the C-terminus of Aeg1.

      These results indicate that the precise topology of Aeg1 does not significantly impact its ability to engage these binding partners. Aeg1 is predicted to harbor a single transmembrane domain; however, the precise location of this transmembrane segment differs in predictions made by different algorithms. The SMART Web site (1) predicted the transmembrane region to be located at the N-terminus of Aeg1 (7-29 aa). In contrast, Phobius, based on HMM (2, 3), suggested the transmembrane segment is situated more centrally within the Aeg1 protein (134-151 aa), and further proposed that the N-terminus may function as a signal peptide. This latter prediction also provides a potential explanation for the larger-than-expected molecular weight of the Aeg1 truncation mutant observed in the Western blot shown in Fig 1C. The removal of the putative signal peptide may have altered the protein structure, affecting its electrophoretic mobility. As a result, we are more inclined to favor the topology model for Aeg1 predicted by Phobius.

      (2) It is still difficult to identify the midcell localization patterns of Aeg1 and other division proteins from microscopy images (Fig. 4C and Fig. 5A). In Fig 4C, only ZipA and Aeg1 formed clear, regular band-like colocalization patterns. Others formed irregular co-localized puncta along the cell length, different from the expected midcell localization patterns. Cells also appeared to be much longer than WT cells, suggesting cell division defects. The most likely reason for these aberrant localization patterns and filamentous cells is that GFP/mCherry-fusions of these division proteins are not functional and become dominant negative, interfering with proper cell division. The authors need to test the functionality of these fusion proteins before they can be used for imaging. (The authors also mislabeled the Hoechst and division protein GFP panel labels in this figure.)

      Thank you for raising this important point. To examine the functionality of the fluorescence protein fusion constructs, we have painstakingly performed conditional knockout of the genes of interest (zipA, ftsB, ftsL, and ftsN) in A. baumannii strains inducibly expressing the corresponding fusion protein. We found that these fluorescence protein fusions were able to fully rescue the growth of the mutant lacking the corresponding fts gene (Figure 4-figure supplement 1). Concurrently, we have also successfully knocked out the aeg1 gene under conditions of in trans expression of an mCherry-Aeg1 fusion protein, which was able to effectively rescue the growth defects of the Δ_aeg1_ mutant (Figure 4-figure supplement 1). We then introduced the functional fluorescence protein fusions into wild-type cells and observed the co-localization of Aeg1 with the relevant Fts proteins. The results showed that Aeg1 indeed co-localized with ZipA, FtsB, FtsL, and FtsN (Fig. 4E, red arrows), but occasional non-co-localization was also observed (Fig. 4E, white arrows).

      We have utilized the functional fluorescence protein fusion constructs to analyze the localization of relevant Aeg1-interacting proteins in the Δ_aeg1_ strain upon Aeg1 depletion. Our results showed that the depletion of Aeg1 indeed impacted the midcell localization of the several Aeg1-interacting Fts proteins.

      References

      (1) Letunic I, Khedkar S, Bork P. SMART: recent updates, new developments and status in 2020. Nucleic acids research. 2021;49:D458-D460. doi: 10.1093/nar/gkaa937

      (2) Käll L, Krogh A, Sonnhammer EL. A combined transmembrane topology and signal peptide prediction method. Journal of molecular biology. 2004;338:1027-36. doi: 10.1016/j.jmb.2004.03.016

      (3) Käll L, Krogh A, Sonnhammer EL. Advantages of combined transmembrane topology and signal peptide prediction--the Phobius web server. Nucleic acids research. 2007;35:W429-32. doi: 10.1093/nar/gkm256

    1. eLife assessment

      Working with a diverse panel of field-grown rice accessions, this valuable study measures changes in transcript abundance, tests for patterns of selection on gene expression, and maps the genetic basis of variation in gene expression in normal and high salinity conditions. The authors provide solid evidence that salinity treatment increases the number of genes with mean expression levels away from the optimum, and that a relatively small number of genes are hotspots for genetic variants that affect genome-wide patterns of variation in gene expression under high salinity. The design, clarity, and interpretation of several statistical analyses can be improved, additional opportunities for integration among datasets and analyses could be realized, and genetic manipulation would be required to confirm the functional involvement of any specific genes in regulatory networks or organismal traits that confer adaptation to higher salinity conditions. The manuscript will not only be of interest to evolutionary biologists studying the genetics of complex traits, but it will also be a resource for plant biologists studying mechanisms of abiotic stress tolerance.

    2. Reviewer #1 (Public Review):

      Summary:

      Understanding the mechanisms of how organisms respond to environmental stresses is a key goal of biological research. Assessment of transcriptional responses to stress can provide some insights into those underlying mechanisms. The researchers quantified traits, fitness, and gene expression (transcriptional) response to salinity stress (control vs stress treatments) for 130 accessions of rice (three replicates for each accession), which were grown in the field in the Philippines. This experimental design allowed for many different types of downstream analyses to better understand the biology of the system. These analyses included estimating the strength of selection imposed on transcription in each environment, evaluating possible trade-offs in gene expression, testing whether salinity induces transcriptional decoherence, and conducting various eQTL-type analyses.

      Strengths:

      The study provides an extensive analysis of gene expression responses to stress in rice and offers some insights into underlying mechanisms of salinity responses in this important crop system. The fact that the study was conducted under field conditions is a major plus, as the gene expression responses to soil salinity are more realistic than if the study was conducted in a greenhouse or growth chamber. The preprint is generally well-written and the methods and results are mostly well-described.

      Weaknesses:

      While the study makes good use of analyzing the dataset, it is not clear how the current work advances our understanding of gene regulatory evolution or plant responses to soil salinity generally. Overall, the results are consistent with other prior studies of gene expression and studies of selection across environmental conditions. Some of the framing of the paper suggests that there is more novelty to this study than there is in reality. That said, the results will certainly be useful for those working in rice and should be interesting to scientists interested in how gene expression responses to stress occur under field conditions. I detail other concerns I had about the preprint below:

      The abstract on lines 33-35 illustrates some of my concerns about the overstatement of the novelty of the current study. For example, is it really true that the role of gene expression in mediating stress response and adaptation is largely unexplored? There have been numerous studies that have evaluated gene expression responses to stresses in a wide range of organisms. Perhaps, I am missing something critically different about this study. If so, I would recommend that the authors reword this sentence to clarify what gap is being filled by this study. Further, is it really the case that none of them have evaluated how the correlational structure of gene expression changes in response to stresses in plants, as implied in lines 263-265? Don't the various modules and PC analyses of gene expression get at this question?

      There were some places in the methods of the preprint that required more information to properly evaluate. For example, more information should be provided on lines 664-668 about how G, E, and GxE effects were established, especially since this is so central to this study. What programs/software (R? SAS? Other?) were used for these analyses? If R, how were the ANOVAs/models fit? What type of ANOVA was used? How exactly was significance determined for each term? Which effects were considered fixed and which were random? If the goal was to fit mixed models, why not use an approach like voom-limma (Law et al. 2014 Genome Biology)? More details should also be added to lines 688-709 about these analyses, including what software/programs were used for these analyses.

      One thing that I found a bit confusing throughout was the intermixing of different terms and types of selection. In particular, there seemed to be some inconsistencies with the usage of quantitative genetics terms for selection (e.g. directional, stabilizing) vs molecular evolution terms for selection (e.g. positive, purifying). I would encourage the authors to think carefully about what they mean by each of these terms and make sure that those definitions are consistently applied here.

      It would be useful to clarify the reasons for the inherent bias in the detection of conditional neutrality (CN) and antagonistic pleiotropy (AP; Lines 187-196). As currently written, it is also not clear to me what the authors did to deal with the bias in terms of adjusting P-value thresholds for CN and AP. Further, I found the discussion of antagonistic pleiotropy and conditional neutrality to be a bit confusing for a couple of reasons, especially around lines 489-491. First of all, does it really make sense to contrast gene expression versus local adaptation, when lots of local adaptation likely involves changes in gene expression? Second, the implication that antagonistic pleiotropy is more common for local adaptation than the results found in this study seems questionable. Conditional neutrality appears to be more common for local adaptation as well: see Table 2 of Wadgymar et al. 2017 Methods in Ecology and Evolution. That all said, it is always difficult to conclude that there are no trade-offs (antagonistic pleiotropy) for a particular locus, as trade-offs may only manifest in some years and not others, and detecting them can require large sample sizes if they are subtle in effect.

    3. Reviewer #2 (Public Review):

      The authors investigate the gene expression variation in a rice diversity panel under normal and saline growth conditions to gain insight into the underlying molecular adaptive response to salinity. They present a convincing case to demonstrate that environmental stress can induce selective pressure on gene expression, which is in agreement with their earlier study (Groen et al, 2020). The data seem to be a good fit for their study and overall the analytic approach is robust.

      (1) The work started by investigating the effect of genotype, environment, and their interaction at each transcript level using 3'-end-biased mRNA sequencing, and detecting a widespread GxE effect. Later, using the total filled grain number as a proxy of fitness, they estimated the strength of selection on each transcript and reported stronger selective pressure in a saline environment. However, this current framework relies on precise estimation of fitness and therefore can be sensitive to the choice of fitness proxy.

      (2) Furthermore, the authors decomposed the genetic architecture of expression variation into cis- and trans-eQTL in each environment separately and reported more unique environment-specific trans-eQTLs than cis-. The relative contribution of cis- and trans-eQTL depends on both the abundance and effect size. I wonder why the latter was not reported while comparing these two different genetic architectures. If the authors were to compare the variation explained by these two categories of eQTL instead of their frequency, would the inference that trans-eQTLs are primarily associated with expression variation still hold?

      (3) Next, the authors investigated the relationship between cis- and trans-eQTLs at the transcript level and revealed an excess of reinforcement over the compensation pattern. Here, I struggle to understand the motivation for testing the relationship by comparing the effect of the cis-eQTL with the mean effect of all trans-eQTLs of a given transcript. My concern is that taking the mean can diminish the effect of small trans-eQTLs, potentially biasing the relationship towards the large-effect eQTLs.

    4. Reviewer #3 (Public Review):

      In this work, the authors conducted a large-scale field trial of 130 indica accessions in normal vs. moderate salt stress conditions. The experiment consists of 3 replicates for each accession in each treatment, making it 780 plants in total. Leaf transcriptome, plant traits, and final yield were collected. Starting from a quantitative genetics framework, the authors first dissected the heritability and selection forces acting on gene expression. After summarizing the selection force acting on gene expression (or plant traits) in each environment, the authors described the difference in gene expression correlation between environments. The final part consists of eQTL investigation and categorizing cis- and trans-effects acting on gene expression.

      Building on the group's previous study and using a similar methodology (Groen et al. 2020, 2021), the unique aspect of this study is in incorporating large-scale empirical field work and combining gene expression data with plant traits. Unlike many systems biology studies, this study strongly emphasizes the quantitative genetics perspective and investigates the empirical fitness effects of gene expression data. The large amounts of RNAseq data (one sample for each plant individual) also allow heritability calculation. This study also utilizes the population genetics perspective to test for traces of selection around eQTL. As there are too many genes to fit in multiple regression (for selection analysis) and to construct the G-matrix (for breeder's equation), grouping genes into PCs is a very good idea.

      Building on large amounts of data, this study conducted many analyses and described some patterns, but a central message or hypothesis would still be necessary. Currently, the selection analysis, transcript correlation structure change, and eQTL parts seem to be independent. The manuscript currently looks like a combination of several parallel works, and this is reflected in the Results, where each part has its own short introduction (e.g., 185-187, 261-266, 349-353). It would be great to discuss how these patterns observed could be translated to larger biological insights. On a related note, since this and the previous studies (focusing on dry-wet environments) use a similar methodology, one would also wonder what the conclusions from these studies would be. How do they agree or disagree with each other?

      Many analyses were done separately for each environment, and results from these two environments are listed together for comparison. Especially for the eQTL part, no specific comparison was discussed between the two environments. It would be interesting to consider whether one could fit the data in more coherent models specifically modeling the X-by-environment effects, where X might be transcripts, PCs, traits, transcript-transcript correlation, or eQTLs.

      As stated, grouping genes into PCs is a good idea, but although in theory the PCs are orthogonal, each gene still has some loading on each PC (i.e., each PC is not controlled by a completely different set of genes). Another possibility is to use a gene grouping method, such as WGCNA, to group genes into modules and use the PC1 of each module. There, each module would consist of completely different sets of genes, and one would be more likely to separate the biological functions of each module. I wonder whether the authors could discuss the pros and cons of these methods.

    5. Reviewer #4 (Public Review):

      The manuscript examines how patterns of selection on gene expression differ between a normal field environment and a field environment with elevated salinity based on transcript abundances obtained from leaves of a diverse panel of rice germplasm. In addition, the manuscript also maps expression QTL (eQTL) that explains variation in each environment. One highlight from the mapping is that a small group of trans-mapping regulators explains some gene expression variation for large sets of transcripts in each environment. The overall scope of the datasets is impressive, combining large field studies that capture information about fecundity, gene expression, and trait variation at multiple sites. The finding related to patterns indicating increased LD among eQTLs that have cis-trans compensatory or reinforcing effects is interesting in the context of other recent work finding patterns of epistatic selection. However, other analyses in the manuscript are less compelling or do not make the most of the value of collected data. Revisions are also warranted to improve the precision with which field-specific terminology is applied and the language chosen when interpreting analytical findings.

      Selection of gene expression: One strength of the dataset is that gene expression and fecundity were measured for the same genotypes in multiple environments. However, the selection analyses are largely conducted within environments. The addition of phenotypic selection analyses that jointly analyze gene expression across environments and/or selection on reaction norms would be worthwhile.

      Gene expression trade-offs: The terminology and possibly methods involved in the section on gene expression trade-offs need amendment. I specifically recommend discontinuing reference to the analysis presented as an analysis of antagonistic pleiotropy (rather than more general trade-offs) because pleiotropy is defined as a property of a genotype, not a phenotype. Gene expression levels are a molecular phenotype, influenced by both genotype and the environment. By conducting analyses of selection within environments as reported, the analysis does not account for the fact that the distribution of phenotypic values, the fitness surface, or both may differ across environments. Thus, this presents a very different situation than asking whether the genotypic effect of a QTL on fitness differs across environments, which is the context in which the contrasting terms antagonistic pleiotropy and conditional neutrality have been traditionally applied. A more interesting analysis would be to examine whether the covariance of phenotype with fitness has truly changed between environments or whether the phenotypic distribution has just shifted to a different area of a static fitness surface.

      Biological processes under selection / Decoherence: PCs are likely not the most ideal way to cluster genes to generate consolidated metrics for a selection gradient analysis. Because individual genes will contribute to multiple PCs, the current fractional majority-rule method applied to determine whether a PC is under direct or indirect selection for increased or decreased expression comes across as arbitrary and with the potential for double-counting genes. A gene co-expression network analysis could be more appropriate, as genes only belong to one module and one can examine how selection is acting on the eigengene of a co-expression module. Building gene co-expression modules would also provide a complementary and more concrete framework for evaluating whether salinity stress induces "decoherence" and which functional groups of genes are most impacted.

      Selection of traits: Having paired organismal and molecular trait data is a strength of the manuscript, but the organismal trait data are underutilized. The manuscript as written only makes weak indirect inferences based on GO categories or assumed gene functions to connect selection at the organismal and molecular levels. Stronger connections could be made, for instance, by showing selection on co-expression module eigengene values that are also correlated with traits that show similar patterns of selection, or by demonstrating that GWAS hits for trait variation co-localize to cis-mapping eQTL.

      Genetic architecture of gene expression variation: The descriptive statistics of the eQTL analysis summarize counts of eQTLs observed in each environment, but these numbers are not broken down to the molecular trait level (e.g., what are the median and range of cis- and trans-eQTLs per gene). In addition, genetic architecture is a combination of the numbers and relative effect sizes of the QTLs. It would be useful to provide information about the relative distributions of phenotypic variance explained by the cis- vs. trans- eQTLs and whether those distributions vary by environment. The motivation for examining patterns of cis-trans compensation specifically for the results obtained under high salinity conditions is unclear to me. If the lines sampled have predominantly evolved under low salinity conditions and the hypothesis being evaluated relates to historical experience of stabilizing selection, then my intuition is that evaluating the eQTL patterns under normal conditions provides the more relevant test of the hypothesis.

    6. Author response:

      Reviewer #1:

      (1) Clarification of Novelty and Contribution:

      - We agree that the novelty of our study could have been better articulated. We will more clearly define the specific gaps in knowledge our study addresses. We will also clarify the novelty in our analysis of the correlational structure of gene expression under stress.

      (2) Methodological Details:

      - We acknowledge the need for additional detail in the methods section regarding the estimation of G, E, and GxE effects. We will expand this section to include the software used (R), the specific ANOVA models applied, and how significance was determined. We will also clarify which effects were treated as fixed or random effects.

      (3) Terminology Consistency:

      - We will thoroughly review the manuscript to ensure consistent use of selection-related terminology. This will involve distinguishing between quantitative genetics terms (e.g., directional, stabilizing) and molecular evolution terms (e.g., positive, purifying) to avoid any confusion.

      (4) Bias in Conditional Neutrality and Antagonistic Pleiotropy:

      - We appreciate the suggestion to clarify the discussion around conditional neutrality (CN) and antagonistic pleiotropy (AP). We will elaborate on the inherent bias in detecting CN and AP and specify how we adjusted P-value thresholds. Additionally, we will try to refine the discussion to address the concerns raised about the comparison of gene expression and local adaptation, incorporating relevant literature.

      Reviewer #2:

      (1) Sensitivity of Fitness Proxy:

      - We acknowledge the limitations of using the total filled grain number as a fitness proxy. We will include a discussion on the potential sensitivity of our results to this choice.

      (2) Cis- and trans-eQTL Contributions:

      - We appreciate the suggestion to report effect sizes in addition to the frequency of cis- and trans-eQTLs. We will incorporate this into our analysis and discuss whether our conclusions regarding the predominance of trans-eQTLs in expression variation hold when considering effect sizes.

      (3) Cis-Trans Relationship Analysis:

      - Since we wanted to estimate compensating vs. reinforcing effects, this essentially entails identifying genes whose cis- and trans-effects have opposing directionality. To get the total trans-effect, we decided to take the mean effect of trans-eQTLs. This mean was only used to identify the compensating/reinforcing genes, and although taking the mean diminishes the effect of small trans-eQTLs, this metric was not used in downstream analyses.
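
      The classification described in this response can be illustrated with a minimal sketch. It is not the authors' code: the function name `classify_cis_trans`, the "undetermined" label, and the numeric effect sizes are hypothetical and chosen only for illustration. It assumes one cis-eQTL effect per gene and labels the gene by comparing the sign of that effect with the sign of the mean trans-eQTL effect.

```python
from statistics import mean


def classify_cis_trans(cis_effect, trans_effects):
    """Label a gene by comparing the sign of its cis effect with the
    sign of the mean effect of its trans-eQTLs (hypothetical helper)."""
    if not trans_effects:
        return "undetermined"  # no trans-eQTL mapped for this gene
    total_trans = mean(trans_effects)
    if cis_effect == 0 or total_trans == 0:
        return "undetermined"  # no direction to compare
    same_direction = (cis_effect > 0) == (total_trans > 0)
    return "reinforcing" if same_direction else "compensating"


# Made-up effect sizes, for illustration only.
print(classify_cis_trans(0.8, [-0.5, -0.1]))  # opposite signs
print(classify_cis_trans(0.8, [0.4, 0.2]))    # same sign
# Three small negative trans effects are outweighed by one large positive one,
# which is the sensitivity to large-effect eQTLs that Reviewer #2 raised.
print(classify_cis_trans(0.5, [-0.05, -0.05, -0.05, 0.9]))
```

      Under these assumptions, the label depends only on the sign of the mean, so a single large-effect trans-eQTL can determine the class of a gene even when most of its trans-eQTLs point the other way.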

      Reviewer #3:

      (1) Integration of Analyses:

      - We acknowledge that the manuscript currently presents some analyses in a somewhat independent manner. Although it would be ideal to have a central hypothesis/message, our study is meant to broadly outline the various responses and fitness effects of salinity stress on rice. Throughout the manuscript, we have also included comparisons between our findings and those of our previous studies on drought stress to highlight any consistent themes or novel insights.

      (2) X-by-Environment Effects:

      - We do plan to consider fitting models that explicitly incorporate X-by-environment interactions to provide a more detailed understanding of the genetics of plasticity between the two environments, but it is beyond the scope of this paper. This will be explored in a separate report.

      (3) Gene Grouping Methods:

      - We will try to discuss the pros and cons of using PCA versus gene co-expression network analysis (e.g., WGCNA) for grouping genes. We will also explore applying WGCNA in our analysis to see if it offers any additional insights or clarity.

      Reviewer #4:

      (1) Selection Analysis Across Environments:

      - We do plan to consider fitting models that explicitly incorporate G×E interactions to provide a more detailed understanding of the genetics of plasticity between the two environments, but it is beyond the scope of this paper. This will be explored in a separate report.

      (2) Gene Expression Trade-Offs Terminology:

      - We will revise our terminology to better reflect the nature of the trade-offs observed, and explore variation in covariance between phenotype and fitness between the two environments.

      (3) Biological Processes and Decoherence:

      - We will explore applying WGCNA in our analysis to see if it offers any additional insights or clarity.

      (4) Underutilization of Organismal Traits:

      - We did perform GWAS for all the traits measured in both environments, but did not find any significant hits. We will examine whether selection on co-expression modules is correlated with the traits, and may incorporate this into our manuscript depending on the results.

      (5) Detailed eQTL Analysis:

      - We will expand our eQTL analysis to include detailed statistics at the molecular trait level, including the phenotypic variance explained by cis- and trans-eQTLs and how these vary by environment.

      Although we focus on salinity conditions in our cis-trans compensation analysis in the main results, we have provided comparisons for all our eQTL analyses between normal and salinity conditions in the main text (with figures as supplementary).

      We are confident that these revisions will significantly strengthen our manuscript and address the concerns raised by the reviewers. We look forward to submitting a revised version that better communicates the significance and robustness of our findings.

      Thank you again for your valuable feedback.

    1. eLife assessment

      This study provides new insights into the expression profile of ILCs that demonstrate a history of RAG expression. It examines in part the potential intrinsic regulation of RAG expression and seeks to understand how the epigenetic state of ILCs is established, although a full understanding of intrinsic factors is incomplete. The work provides an important molecular dataset, and with further strengthening of the understanding of intrinsic regulation, this paper would be of interest more broadly to cell biologists seeking to understand immune cell development.

    2. Reviewer #1 (Public Review):

      The study starts with the notion that in an AD-like disease model, ILC2s in the Rag1 knock-out were expanded and contained relatively more IL-5+ and IL-13+ ILC2s. This was confirmed in the Rag2 knock-out mouse model.

      By using a chimeric mouse model in which wild-type splenocytes were injected into irradiated Rag1 knock-out mice, it was shown that even though the adaptive lymphocyte compartment was restored, there were increased AD-like symptoms and increased ILC2 expansion and activity. Moreover, in the reverse chimeric model, i.e. injecting a mix of wild-type and Rag1 knock-out splenocytes into irradiated wild-type animals, it was shown that the Rag1 knock-out ILC2s expanded more and were more active. Therefore, the authors could conclude that the RAG1-mediated effects were ILC2 cell-intrinsic.

      Subsequent fate-mapping experiments using the Rag1Cre;reporter mouse model showed that there were indeed RAGnaïve and RAGexp ILC2 populations within naïve mice. Lastly, the authors performed multi-omic profiling, using single-cell RNA sequencing and ATAC-sequencing, in which a specific gene expression profile was associated with ILC2. These included well-known genes but the authors notably also found expression of Ccl1 and Ccr8 within the ILC2. The authors confirmed their earlier observations that in the RAGexp ILC2 population, the Th2 regulome was more suppressed, i.e. more closed, compared to the RAGnaïve population, indicative of the suppressive function of RAG on ILC2 activity. I do agree with the authors' notion that the main weakness was that this study lacks the mechanism by which RAG regulates these changes in ILC2s.

      The manuscript is very well written and easy to follow, and the compelling conclusions are well supported by the data. The experiments are meticulously designed and presented. I wish to commend the authors for the study's quality.

      Even though the study is compelling and well supported by the presented data, some additional context could increase the significance:

      (1) The presence of the RAGnaïve and RAGexp ILC2 populations raises some questions on the (different?) origin of these populations. It is known that there are different waves of ILC2 origin (most notably shown in the Schneider et al Immunity 2019 publication, PMID 31128962). I believe it would be very interesting to further discuss or possibly show if there are different origins for these two ILC populations.

      Several publications describe the presence and origin of ILC2s in/from the thymus (PMIDs 33432227 24155745). Could the authors discuss whether there might be a common origin for the RAGexp ILC2 and Th2 cells from a thymic lineage? If true that the two populations would be derived from different populations, e.g. being the embryonic (possibly RAGnaïve) vs. adult bone marrow/thymus (possibly RAGexp), this would show a unique functional difference between the embryonic derived ILC2 vs. adult ILC2.

      (2) On line 104 & Figures 1C/G etc. the authors describe that in the RAG knock-out ILC2 are relatively more abundant in the lineage negative fraction. On line 108 they further briefly mentioned that this observation is an indication of enhanced ILC2 expansion. Since the study includes an extensive multi-omics analysis, could the authors discuss whether they have seen a correlation of RAG expression in ILC2 with regulation of genes associated with proliferation, which could explain this phenomenon?

    3. Reviewer #2 (Public Review):

      Summary:

      The study by Ver Heul et al., investigates the consequences of RAG expression for type 2 innate lymphoid cell (ILC2) function. RAG expression is essential for the generation of the receptors expressed by B and T cells and their subsequent development. Innate lymphocytes, which arise from the same initial progenitor populations, are in part defined by their ability to develop in the absence of RAG expression. However, it has been described in multiple studies that a significant proportion of innate lymphocytes show a history of Rag expression. In compelling studies several years ago, members of this research team revealed that early Rag expression during the development of Natural Killer cells (Karo et al., Cell 2014), the first described innate lymphocyte, had functional consequences.

      Here, the authors revisit this topic, a worthwhile endeavour given the broad history of Rag expression within all ILCs and the common use of RAG-deficient mice to specifically assess ILC function. Focusing on ILC2s and utilising state-of-the-art approaches, the authors sought to understand whether early expression of Rag during ILC2 development had consequences for activity, fitness, or function. Having identified cell-intrinsic effects in vivo, the authors investigated the causes of this, identifying epigenetic changes in the accessibility of genes associated with core ILC2 functions.

      The manuscript is well written and does an excellent job of supporting the reader through reasonably complex transcriptional and epigenetic analyses, with considerate use of explanatory diagrams. Overall I think that the conclusions are fair, the topic is thought-provoking, and the research is likely of broad immunological interest. I think that the extent of functional data and mechanistic insight is appropriate.

      Strengths:

      - The logical and stepwise use of mouse models to first demonstrate the impact on ILC2 function in vivo and a cell-intrinsic role. Initial analyses show enhanced cytokine production by ILC2 from RAG-deficient mice. Then through two different chimeric mice (including BM chimeras), the authors convincingly show this is cell intrinsic and not simply as a result of lymphopenia. This is important given other studies implicating enhanced ILC function in RAG-/- mice reflect altered competition for resources (e.g. cytokines).

      - Use of Rag expression fate mapping to support analyses of how cells were impacted - this enables a robust platform supporting subsequent analyses of the consequences of Rag expression for ILC2.

      - Use of snRNA-seq supports gene expression and chromatin accessibility studies - these reveal clear differences in the data sets consistent with altered ILC2 function.

      - Convincing evidence of epigenetic changes associated with loci strongly linked to ILC2 function. This forms a detailed analysis that potentially helps explain some of the altered ILC2 functions observed in ex vivo stimulation assays.

      - Provision of a wealth of expression data and bioinformatics analyses that can serve as valuable resources to the field.

      Weaknesses:

      - Lack of insight into precisely how early RAG expression mediates its effects, although I think this is beyond the scale of this current manuscript. Really this is the fundamental next question from the data provided here.

      - The epigenetic analyses provide evidence of differences in the state of chromatin, but there is no data on what may be interacting or binding at these sites, impeding understanding of what this means mechanistically.

      - Focus on ILC2 from skin-draining lymph nodes rather than the principal site of ILC2 activity itself (the skin). This may well reflect the ease with which cells can be isolated from different tissues.

      - Comparison with ILC2 from other sites would have helped to substantiate findings and compensate for the reliance on data on ILC2 from skin-draining lymph nodes, which are not usually assessed amongst ILC2 populations.

      - The studies of how ILC2 are impacted are a little limited, focused exclusively on IL-13 and IL-5 cytokine expression.

    4. Reviewer #3 (Public Review):

      In this study, Ver Heul et al. investigate the role of RAG expression in ILC2 functions. While RAG genes are not required for the development of ILCs, previous studies have reported a history of expression in these cells. The authors aim to determine the potential consequences of this expression in mature cells. They demonstrate that ILC2s from RAG1 or RAG2 deficient mice exhibit increased expression of IL-5 and IL-13 and suggest that these cells are expanded in the absence of RAG expression. However, it is unclear whether this effect is due to a direct impact of RAG genes or a consequence of the lack of T and B cells in this condition. This ambiguity represents a key issue with this study: distinguishing the direct effects of RAG genes from the indirect consequences of a lymphopenic environment.

      The authors focus their study on ILC2s found in the skin-draining lymph nodes, omitting analysis of tissues where ILC2s are more enriched, such as the gut, lungs, and fat tissue. This approach is surprising given the goal of evaluating the role of RAG genes in ILC2s across different tissues. The study shows that ILC2s derived from RAG-/- mice are more activated than those from WT mice, and RAG-deficient mice show increased inflammation in an atopic dermatitis (AD)-like disease model. The authors use an elegant model to distinguish ILC2s with a history of RAG expression from those that never expressed RAG genes. However, this model is currently limited to transcriptional and epigenomic analyses, which suggest that RAG genes suppress the type 2 regulome at the Th2 locus in ILC2s.

      The authors report a higher frequency of ILC2s in RAG-/- mice in skin-draining lymph nodes, which is expected as these mice lack T and B cells, leading to ILC expansion. Previous studies have reported hyper-activation of ILCs in RAG-deficient mice, suggesting that this is not necessarily an intrinsic phenomenon. For example, RAG-/- mice exhibit hyperphosphorylation of STAT3 in the gut, leading to hyperactivation of ILC3s. This study does not currently provide conclusive evidence of an intrinsic role of RAG genes in the hyperactivation of ILC2s. The splenocyte chimera model is artificial and does not reflect a normal environment in tissues other than the spleen. Similarly, the mixed BM model does not demonstrate an intrinsic role of RAG genes, as RAG1-/- BM cells cannot contribute to the B and T cell pool, leading to an expected expansion of ILC2s. As the data are currently presented it is expected that a proportion of IL-5-producing cells will come from the RAG1-/- BM.

      Overall, the level of analysis could be improved. Total cell numbers are not presented, the response of other immune cells to IL-5 and IL-13 (except the eosinophils in the splenocyte chimera mice) is not analyzed, and the analysis is limited to skin-draining lymph nodes.

      The authors have a promising model in which they can track ILC2s that have expressed RAG or not. They need to perform a comprehensive characterization of ILC2s in these mice, which develop in a normal environment with T and B cells. Approximately 50% of the ILC2s have a history of RAG expression. It would be valuable to know whether these cells differ from ILC2s that never expressed RAG, in terms of proliferation and expression of IL-5 and IL-13. These analyses should be conducted in different tissues, as ILC2s adapt their phenotype and transcriptional landscape to their environment. Additionally, the authors should perform their AD-like disease model in these mice.

      The authors provide a valuable dataset of single-nuclei RNA sequencing (snRNA-seq) and ATAC sequencing (snATAC-seq) from RAGexp (RAG fate map-positive) and RAGnaïve (RAG fate map-negative) ILC2s. This elegant approach demonstrates that ILC2s with a history of RAG expression are epigenomically suppressed. However, key genes such as IL-5 and IL-13 do not appear to be differentially regulated between RAGexp and RAGnaïve ILC2s according to Table S5. Although the authors show that the regulome activity of IL-5 and IL-13 is decreased in RAGexp ILC2s, how do the authors explain that these genes are not differentially expressed between the RAGexp and RAGnaïve ILC2? I think that it is important to validate this in vivo.

    1. eLife assessment

      This study provides a valuable characterization of individual sarcomere's contractility and synchrony in spontaneously beating cardiomyocytes as a function of substrate stiffness. The authors, however, provide an incomplete explanation for the observed heterogeneous and stochastic dynamics, so that the work remains mainly descriptive. The work will be of interest to scientists working on muscle biophysics, nonlinear dynamics, and synchronization phenomena in biological systems.

    2. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, the authors experimentally demonstrated the heterogeneous behavior of sarcomeres in cardiomyocytes and that a stochastic component exists in their contractile activity, which cancels out at the level of myofibrils.

      Strengths:

      The experiments and data analysis are robust and valid. With very good statistics and unbiased methods, they show cellular activity at the individual level and highlight the heterogeneity between biological networks. The similarity of the results to the study cited in [24] demonstrates the validity of the in vitro setup for answering these questions and the feasibility of such in-vitro systems to extend our knowledge of physiology.

      Weaknesses:

      Compared to the current literature ([24]), the study does not show a high degree of innovation. It mainly confirms what has been established in the past. The authors complemented the published experiments by developing an in vitro setup with stem cells and by changing the stiffness of the substrate to simulate pathological conditions. However, the experiments they performed do not allow them to explain more than the study in [24], and the conclusions of their study are based on interpretation and speculation about the possible mechanism underlying the observations.

    3. Reviewer #2 (Public Review):

      Summary:

      Sarcomeres, the contractile units of skeletal and cardiac muscle, contract in a concerted fashion to power myofibril and thus muscle fiber contraction.

      Muscle fiber contraction depends on the stiffness of the elastic substrate of the cell, yet it is not known how this dependence emerges from the collective dynamics of sarcomeres. Here, the authors analyze the contraction time series of individual sarcomeres using live imaging of fluorescently labeled cardiomyocytes cultured on elastic substrates of different stiffness. They find that reduced collective contractility of muscle fibers on unphysiologically stiff substrates is partially explained by a lack of synchronization in the contraction of individual sarcomeres.

      This lack of synchronization is at least partially stochastic, consistent with the notion of a tug-of-war between sarcomeres on stiff substrates. A particular irregularity of sarcomere contraction cycles is 'popping', the extension of sarcomeres beyond their rest length. The statistics of 'popping' suggest that this is a purely random process.

      Strengths:

      This study thus marks an important shift of perspective from whole-cell analysis towards an understanding of the collective dynamics of coupled, stochastic sarcomeres.

      Weaknesses:

      Further insight into mechanisms could be provided by additional analyses and/or comparisons to mathematical models.

    4. Reviewer #3 (Public Review):

      Summary:

      The manuscript of Haertter and coworkers studied the variation of length of a single sarcomere and the response of microfibrils made by sarcomeres of cardiomyocytes on soft gel substrates of varying stiffnesses.

      The measurements at the level of a single sarcomere are an important new result of this manuscript. They are done by combining the labeling of the sarcomere Z-line using genetic manipulation and a sophisticated tracking program using machine learning. This single sarcomere analysis shows strong heterogeneities of the sarcomeres that can show fast oscillations not synchronized with the average behavior of the cell and what the authors call popping events, which are large amplitude oscillations. Another important result is the fact that cardiomyocyte contractility decreases with the substrate stiffness although the properties of single sarcomeres do not seem to depend on substrate stiffness.

      The authors suggest that the cardiomyocyte cell behavior is dominated by sarcomere heterogeneity. They show that the heterogeneity between sarcomeres is stochastic and that the contribution of static heterogeneity (such as composition differences between sarcomeres) is small.

      Strengths:

      All the results are to my knowledge new and original and deserve attention.

      Weaknesses:

      However, I find the manuscript a bit frustrating because the authors only give very qualitative explanations of the phenomena that they observe. They mention that popping could be explained by a nonlinear force-velocity relation of the sarcomere leading to a rapid detachment of all motors. However, they do not explicitly provide a theoretical description. How would the popping depend on the parameters and in particular on the substrate stiffness? Would the popping statistics be affected by the stiffness? It is also not clear to me how the dependence on the soft gel stiffness of the cardiomyocyte cell can be explained by the stochasticity of the sarcomere properties. Can any of the results found by the authors be explained by existing theories of cardiomyocytes? The only one I know is that of Safran and coworkers.

      I also found the paper very difficult to read. The authors should perhaps reorganize the structure of the presentation in order to highlight what the new and important results are.

    1. eLife assessment

      This study demonstrates a novel role for SIRT4, a mitochondrial deacetylase shown to translocate into nuclei, where it regulates RNA alternative splicing by modulating U2AF2 and the gene expression of CCN2 in tubular cells in response to TGF-β. This fundamental work substantially advances our understanding of kidney fibrosis development and offers a potential therapeutic approach. The evidence supporting the conclusions of a SIRT4-U2AF2-CCN2 axis activated by TGF-β is compelling and adds a new layer of complexity to the pathogenesis of chronic kidney disease.

    2. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Yang et al report a novel regulatory role of SIRT4 in the progression of kidney fibrosis. The authors showed that in the fibrotic kidney, SIRT4 exhibited an increased nuclear localization. Deletion of Sirt4 in renal tubule epithelium attenuated the extent of kidney fibrosis following injury, while overexpression of SIRT4 aggravates kidney fibrosis. Employing a battery of in vitro and in vivo experiments, the authors demonstrated that SIRT4 interacts with U2AF2 in the nucleus upon TGF-β1 stimulation or kidney injury and deacetylates U2AF2 at K413, resulting in elevated CCN2 expression through alternative splicing of Ccn2 gene to promote kidney fibrosis. The authors further showed that the translocation of SIRT4 is through the BAX/BAK pore complex and is dependent on the ERK1/2-mediated phosphorylation of SIRT4 at S36, and consequently the binding of SIRT4 to importin α1. This fundamental work substantially advances our understanding of the progression of kidney fibrosis and uncovers a novel SIRT4-U2AF2-CCN2 axis as a potential therapeutic target for kidney fibrosis.

      Strengths:

      Overall, this is an extensive, well-performed study. The results are convincing, and the conclusions are mostly well supported by the data. The message is interesting to a wider community working on kidney fibrosis, protein acetylation, and SIRT4 biology.

      Weaknesses:

      The manuscript could be further strengthened if the authors could address a few points listed below:

      (1) In the results part 3.9, an in vitro deacetylation assay employing recombinant SIRT4 and U2AF2 should be included to support the conclusion that SIRT4 is a deacetylase of U2AF2. Similarly, an in vitro binding assay can be included to confirm whether SIRT4 and U2AF2 interact directly.

      (2) In Figure 6D, the Western blot data using U2AF2-K453Q are confusing, disconnected from the rest of the data, and not explained. These data should either be removed or accompanied by an explanation of why U2AF2-K453Q is employed here.

      (3) Although ERK inhibitor U0126 blocked the nuclear translocation of SIRT4 in vivo, have the authors checked whether treatment with U0126 could affect the expression of kidney fibrosis markers in UUO mice?

      (4) The format of gene and protein abbreviations in the manuscript should be standardized.

      (5) There are a few grammar issues throughout the manuscript. The English/grammar could be stronger, thus improving the overall accessibility of the science to readers.

    3. Reviewer #2 (Public Review):

      Summary:

      This manuscript presents a novel and significant investigation into the role of SIRT4 in CCN2 expression in response to TGF-β, through modulation of U2AF2-mediated alternative splicing, and its impact on the development of kidney fibrosis.

      Strengths:

      The authors' main conclusion is that SIRT4 plays a role in kidney fibrosis by regulating CCN2 expression via pre-mRNA splicing. Additionally, the study reveals that SIRT4 translocates from the mitochondria to the cytoplasm through the BAX/BAK pore under TGF-β stimulation. In the cytoplasm, TGF-β activated the ERK pathway and induced the phosphorylation of SIRT4 at Ser36, further promoting its interaction with importin α1 and subsequent nuclear translocation. In the nucleus, SIRT4 was found to deacetylate U2AF2 at K413, facilitating the splicing of CCN2 pre-mRNA to promote CCN2 protein expression. Overall, the findings are fully convincing. The current study, to some extent, shows potential importance in this field. 

      Weaknesses:

      (1) Exosomes containing anti-SIRT4 antibodies were found to effectively mitigate UUO-induced kidney fibrosis in mice. However, the protein loading capacity and loading methods were not mentioned.

      (2) The method section is incomplete, and many methods like cell culture, cell transfection, gene expression profiling analysis, and splicing analysis, were not introduced in detail.

      (3) The authors should compare their results with previous studies and mention clearly how their work is important in comparison to what has already been reported in the Discussion section.

    4. Reviewer #3 (Public Review):

      Summary:

      Yang et al reported in this paper that TGF-beta induces SIRT4 activation; TGF-beta-activated SIRT4 then modulates U2AF2-mediated alternative splicing, and U2AF2 in turn drives CCN2 expression. The mechanism is described as follows: mitochondrial SIRT4 is transported into the cytoplasm in response to TGF-β stimulation, is phosphorylated by ERK in the cytoplasm, and then undergoes nuclear translocation by forming a complex with importin α1. In the nucleus, SIRT4 can then deacetylate U2AF2 at K413 to facilitate the splicing of CCN2 pre-mRNA to promote CCN2 protein expression. Moreover, they used exosomes to deliver Sirt4 antibodies to mitigate renal fibrosis in a mouse model. TGF-beta has been widely reported for its role in fibrosis induction.

      Strengths:

      TGF-beta induction of SIRT4 translocation from mitochondria to nuclei for epigenetics or gene regulation remains largely unknown. The findings presented here that SIRT4 is involved in U2AF2 deacetylation and CCN2 expression are interesting.

      Weaknesses:

      SIRT4 plays a critical role in mitochondria, where it is involved in the respiratory chain reaction. This role of SIRT4 is critically involved in many cell functions. It is hard to rule out such a mitochondrial activity of SIRT4 in renal fibrosis. Moreover, the major concern is what kind of message mitochondrial SIRT4 proteins receive from TGF-beta. Although nuclear SIRT4 is increased in response to TNF treatment, it is likely that de novo synthesized SIRT4 proteins can also undergo nuclear translocation upon cytokine stimulation. TGF-beta-induced mitochondrial calcium uptake and acetyl-CoA should be evaluated, as calcium and acetyl-CoA may contribute to the regulation of gene expression in nuclei.

    1. eLife assessment

      Zhao et al. report valuable adverse effects on cell proliferation, differentiation and gene expression, possibly linked to reduced binding activity of the transcription factor GTF2IRD1 to the transthyretin (TTR) promoter, in a human forebrain organoid model of Williams Syndrome (WS). The authors provide incomplete evidence of the effects of GTF2IRD1, a mutated gene in WS, on altering MAPK/ERK pathway activity, a well-recognized target in cell proliferation.

    2. Reviewer #1 (Public Review):

      Summary:

      Zhao et al. used a human forebrain organoid model, a transgenic mouse model, and embryonic neural progenitor cells to investigate the mutation previously identified in Williams Syndrome. They found abnormal proliferation and differentiation induced by this mutation, as well as altered expression profiles corresponding with aberrant cell clusters. This is regulated through the binding of GTF2IRD1 to transthyretin (TTR) promoter regions, and was tested for neurodevelopmental deficits in the three models mentioned above.

      Strengths:

      The authors have applied cell culture, organoid culture, and an in vivo model to test the previously reported mutation found in Williams Syndrome. They investigated cell behavior, including proliferation and differentiation, while using NGS to identify potential signaling pathways that are highly involved and can serve as candidates for rescuing the phenotype.

    3. Reviewer #2 (Public Review):

      Summary:

      The study by Xingsen Zhao et al on "A human forebrain organoid model reveals the essential function of GTF2IRD1-TTR-ERK axis for the neurodevelopmental deficits of Williams Syndrome" presents a forebrain organoid model for WS and has identified defects in neurogenesis. The authors have performed scRNAseq on these patient-derived forebrain organoids, showing upregulated expression of genes related to cell proliferation, while genes involved in neuronal differentiation were downregulated. The major findings presented in this study are an increase in the size of the SOX2+ ventricular zone in WS forebrain organoids with an altered developmental trajectory and aberrant excitatory neurogenesis. The study also presents evidence that transthyretin (TTR) has reduced expression in WS organoids, and that its expression is regulated by the transcription factor GTF2IRD1. The authors then go on to identify mechanistic details of TTR function on the MAPK/ERK pathway, which is known to be involved in brain development. Overall, this is a well-constructed study revealing the function of one of the key genes that is deleted in WS and provides novel insights into mechanisms underlying the abnormal neurogenesis in the WS brain.

      Strengths:

      WS patients have neurocognitive disorders which most likely stem from defects in early neurodevelopment. This study has investigated a WS forebrain organoid model with scRNAseq and identified differences in cell proliferation and differentiation. This study has presented some new evidence regarding the function and regulation of TTR and its regulator GTF2IRD1 during brain development.

      Weaknesses:

      The evidence presented for the mechanism of action of TTR on the MAPK pathway is unclear and lacks depth. It would require identifying downstream targets of TTR and how it interacts with the MAPK pathway.

    1. eLife assessment

      The authors present a potentially useful approach of broad interest arguing that anterior cingulate cortex (ACC) tracks option values in decisions involving delayed rewards. The authors introduce the idea of a resource-based cognitive effort signal in ACC ensembles and link ACC theta oscillations to a resistance-based strategy. The evidence supporting these new ideas is incomplete and would benefit from additional detail and more rigorous analyses and computational methods.

    2. Reviewer #1 (Public Review):

      Summary:

      Young (2.5 mo [adolescent]) rats were tasked to either press one lever for immediate reward or another for delayed reward. The task had a complex structure in which (1) the number of pellets provided on the immediate reward lever changed as a function of the decisions made, (2) rats were prevented from pressing the same lever three times in a row. Importantly, this task is very different from most intertemporal choice tasks which adjust delay (to the delayed lever), whereas this task held the delay constant and adjusted the number of 20 mg sucrose pellets provided on the immediate value lever.

      Analyses are based on separating sessions into groups, but group membership includes arbitrary requirements and many sessions have been dropped from the analyses. Computational modeling is based on an overly simple reinforcement learning model, as evidenced by fit parameters pegging to the extremes. The neural analysis is overly complex and does not contain the necessary statistics to assess the validity of their claims.

      Strengths:

      The task is interesting.

      Weaknesses:

      Behavior:

      The basic behavioral results from this task are not presented. For example, "each recording session consisted of 40 choice trials or 45 minutes". What was the distribution of choices over sessions? Did that change between rats? Did that change between delays? Were there any sequence effects? (I recommend looking at reaction times.) Were there any effects of pressing a lever twice vs after a forced trial? This task has a very complicated sequential structure that I think I would be hard pressed to follow if I were performing this task. Before diving into the complex analyses assuming reinforcement learning paradigms or cognitive control, I would have liked to have understood the basic behaviors the rats were taking. For example, what was the typical rate of lever pressing? If the rats are pressing 40 times in 45 minutes, does waiting 8s make a large difference?

      For that matter, the reaction time from lever appearance to lever pressing would be very interesting (and important). Are they making a choice as soon as the levers appear? Are they leaning towards the delay side, but then give in and choose the immediate lever? What are the reaction time hazard distributions?

      It is not clear that the animals on this task were actually using cognitive control strategies on this task. One cannot assume from the task that cognitive control is key. The authors only consider a very limited number of potential behaviors (an overly simple RL model). On this task, there are a lot of potential behavioral strategies: "win-stay/lose-shift", "perseveration", "alternation", even "random choices" should be considered.

      The delay lever was assigned to the "non-preferred side". How did side bias affect the decisions made?

      The analyses based on "group" are unjustified. The authors compare the proportion of delayed to immediate lever press choices on the non-forced trials and then did k-means clustering on this distribution. But the distribution itself was not shown, so it is unclear whether the "groups" were actually different. They used k=3, but do not describe how this arbitrary number was chosen. (Is 3 the optimal number of clusters to describe this distribution?) Moreover, they removed three group 1 sessions with an 8s delay and two group 2 sessions with a 4s delay, making all the group 1 sessions 4s delay sessions and all group 2 sessions 8s delay sessions. They then ignore group 3 completely. These analyses seem arbitrary and unnecessarily complex. I think they need to analyze the data by delay. (How do rats handle 4s delay sessions? How do rats handle 6s delay sessions? How do rats handle 8s delay sessions?). If they decide to analyze the data by strategy, then they should identify specific strategies, model those strategies, and do model comparison to identify the best explanatory strategy. Importantly, the groups were session-based, not rat based, suggesting that rats used different strategies based on the delay to the delayed lever.
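      The question of whether k=3 is the best number of clusters can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the per-session proportions are invented for the example (they are not data from the manuscript), `kmeans_1d` is a hypothetical helper with a deterministic quantile-based initialization, and the within-cluster sum of squares (the "elbow" heuristic) is one of several possible criteria, not the method the authors used.

```python
def kmeans_1d(values, k, iterations=100):
    """Cluster 1-D values into k groups; return (centers, within-cluster sum of squares)."""
    data = sorted(values)
    n = len(data)
    # Deterministic initialization at evenly spaced quantiles (an assumption of this sketch).
    centers = [data[int((i + 0.5) * n / k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda j: abs(x - centers[j]))
            clusters[nearest].append(x)
        # Keep the old center if a cluster happens to be empty.
        new_centers = [sum(c) / len(c) if c else centers[j]
                       for j, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    # Within-cluster sum of squares using the final centers.
    wcss = sum(min((x - c) ** 2 for c in centers) for x in data)
    return centers, wcss


# Hypothetical proportions of delayed-lever choices per session (illustrative only).
proportions = [0.10, 0.12, 0.15, 0.50, 0.52, 0.55, 0.85, 0.88, 0.90]

# Compare the fit across candidate k instead of fixing k=3 in advance.
wcss_by_k = {k: kmeans_1d(proportions, k)[1] for k in range(1, 5)}
for k, w in wcss_by_k.items():
    print(f"k={k}: within-cluster sum of squares = {w:.4f}")
```

      If the sum of squares keeps dropping smoothly as k increases, with no clear elbow, the distribution may be better described as continuous than as discrete groups, which is the concern raised above.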

      The reinforcement learning model used was overly simple. In particular, the RL model assumes that the subjects understand the task structure, but we know that even humans have trouble following complex task structures. Moreover, we know that rodent decision-making depends on much more complex strategies (model-based decisions, multi-state decisions, rate-based decisions, etc). There are lots of other ways to encode these decision variables, such as softmax with an inverse temperature rather than epsilon-greedy. The RL model was stated as a given and not justified. As one critical example, the RL model fit to the data assumed a constant exponential discounting function, but it is well-established that all animals, including rodents, use hyperbolic discounting in intertemporal choice tasks. Presumably this changes dramatically the effect of 4s and 8s. As evidence that the RL model is incomplete, the parameters found for the two groups were extreme. (Alpha=1 implies no history and only reacting to the most recent event. Epsilon=0.4 in an epsilon-greedy algorithm is a 40% chance of responding randomly.)
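
      The two modelling choices at issue here (decision rule and discounting function) can be made concrete with a minimal sketch. It is not the authors' model; the reward amount, discount rate k and inverse temperature beta are illustrative assumptions chosen only to show the qualitative differences.

```python
import math


def epsilon_greedy_p_best(epsilon, n_options=2):
    """Probability of choosing the highest-valued option under epsilon-greedy.

    With probability epsilon the agent picks uniformly at random (and may
    still land on the best option); otherwise it picks the best option.
    The size of the value difference never enters the rule.
    """
    return (1.0 - epsilon) + epsilon / n_options


def softmax_p_first(q_first, q_second, beta):
    """Probability of choosing the first of two options under softmax.

    beta is the inverse temperature: the choice probability changes
    smoothly with the value difference q_first - q_second.
    """
    return 1.0 / (1.0 + math.exp(-beta * (q_first - q_second)))


def exponential_value(amount, delay_s, k):
    """Exponentially discounted value: amount * exp(-k * delay)."""
    return amount * math.exp(-k * delay_s)


def hyperbolic_value(amount, delay_s, k):
    """Hyperbolically discounted value: amount / (1 + k * delay)."""
    return amount / (1.0 + k * delay_s)


if __name__ == "__main__":
    amount, k = 6.0, 0.25  # hypothetical pellet count and discount rate
    for delay in (4.0, 8.0):
        print(delay,
              exponential_value(amount, delay, k),
              hyperbolic_value(amount, delay, k))
    for value_gap in (0.5, 1.0, 3.0):
        print(value_gap,
              epsilon_greedy_p_best(0.4),
              softmax_p_first(value_gap, 0.0, beta=1.0))
```

      Note that with two options, epsilon=0.4 still selects the higher-valued option on 80% of trials, because half of the random picks land on it; the rule is nonetheless blind to how large the value difference is, whereas the softmax probability grows with it.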

      The authors do add a "dbias" (which is a preference for the delayed lever) term to the RL model, but note that it has to be maximal in the 4s condition to reproduce group 2 behavior, which means they are not doing reinforcement learning anymore, just choosing the delayed lever.

      Neurophysiology:

      The neurophysiology figures are unclear and mostly uninterpretable; they do not show variability, statistics or conclusive results.

      As with the behavior, I would have liked to have seen more traditional neurophysiological analyses first. What do the cells respond to? How do the manifolds change aligned to the lever presses? Are those different between lever presses? Are there changes in cellular information (both at the individual and ensemble level) over time in the session? How do cellular responses differ during that delay while both levers are out, but the rats are not choosing the immediate lever?

      Figure 3, for example, claims that some of the principal components tracked the number of pellets on the immediate lever ("ival"), but they are just two curves. No statistics, controls, or justification for this is shown. BTW, on Figure 3, what is the event at 200s?

      I'm confused. On Figure 4, the number of trials seems to go up to 50, but in the methods, they say that rats received 40 trials or 45 minutes of experience.

      At the end of page 14, the authors state that the strength of the correlation did not differ by group and that this was "predicted" by the RL modeling, but this statement is nonsensical, given that the RL modeling did not fit the data well and depended on extreme values. Moreover, this claim is dependent on "not statistically detectable", which is, of course, not interpretable as "not different".

      There is an interesting result on page 16 that the increases in theta power were observed before a delayed lever press but not an immediate lever press, and then that the theta power declined after an immediate lever press. These data are separated by session group (again group 1 is a subset of the 4s sessions, group 2 is a subset of the 8s sessions, and group 3 is ignored). I would much rather see these data analyzed by delay itself or by some sort of strategy fit across delays. That being said, I don't see how this description shows up in Figure 6. What does Figure 6 look like if you just separate the sessions by delay?

      Discussion:

      Finally, it is unclear to what extent this task actually gets at the questions originally laid out in the goals and returned to in the discussion. The idea of cognitive effort is interesting, but there is no data presented that this task is cognitive at all. The idea of a resourced cognitive effort and a resistance cognitive effort is interesting, but presumably the way one overcomes resistance is through resource-limited components, so it is unclear that these two cognitive effort strategies are different.

      The authors state that "ival-tracking" (neurons and ensembles that presumably track the number of pellets being delivered on the immediate lever - a fancy name for "expectations") "taps into a resourced-based form of cognitive effort", but no evidence is actually provided that keeping track of the expectation of reward on the immediate lever depends on attention or mnemonic resources. They also state that a "dLP-biased strategy" (waiting out the delay) is a "resistance-based form of cognitive effort" but no evidence is made that going to the delayed side takes effort.

      The authors talk about theta synchrony, but never actually measure theta synchrony, particularly across structures such as amygdala or ventral hippocampus. The authors try to connect this to "the unpleasantness of the delay", but provide no measures of pleasantness or unpleasantness. They have no evidence that waiting out an 8s delay is unpleasant.

      The authors hypothesize that the "ival-tracking signal" (the expectation of number of pellets on the immediate lever) "could simply reflect the emotional or autonomic response". Aside from the fact that no evidence for this is provided, if this were to be true, then, in what sense would any of these signals be related to cognitive control?

    3. Reviewer #2 (Public Review):

      Summary:

      This manuscript explores the neuronal signals that underlie resistance vs resource-based models of cognitive effort. The authors use a delayed discounting task and computational models to explore these ideas. The authors find that the ACC strongly tracks value and time, which is consistent with prior work. Novel contributions include quantification of a resource-based control signal among ACC ensembles, and linking ACC theta oscillations to a resistance-based strategy.

      Strengths:

      The experiments and analyses are well done and have the potential to generate an elegant explanatory framework for ACC neuronal activity. The inclusion of local-field potential / spike-field analyses is particularly important because these can be measured in humans.

      Weaknesses:

      I had questions that might help me understand the task and details of neuronal analyses.

      (1) The abstract, discussion, and introduction set up an opposition between resource and resistance-based forms of cognitive effort. It's clear that the authors find evidence for each (ACC ensembles = resource, theta=resistance?) but I'm not sure where the data fall on this dichotomy.
      a. An overall very simple schematic early in the paper (prior to the MCML model? or even the behavior) may help illustrate the main point.
      b. In the intro, results, and discussion, it may help to relate each point to this dichotomy.
      c. What would resource-based signals look like? What would resistance based signals look like? Is the main point that resistance-based strategies dominate when delays are short, but resource-based strategies dominate when delays are long?
      d. I wonder if these strategies can be illustrated? Could these two measures (dLP vs ival tracking) be plotted on separate axes or extremes, and behavior, neuronal data, LFP, and spectral relationships be shown on these axes? I think Figure 2 is working towards this. Could these be shown for each delay length? This way, as the evidence from behavior, model, single neurons, ensembles, and theta is presented, it can be related to this framework, and the reader can organize the findings.

      (2) The task is not clear to me.
      a. I wonder if a task schematic and a flow chart of training would help readers.
      b. This task appears to be relatively new. Has it been used before in rats (Oberlin and Grahame is a mouse study)? Some history / context might help orient readers.
      c. How many total sessions were completed with ascending delays? Were there criteria for surgeries? How many total recording sessions per animal (of the 54)?
      d. How many trials were completed per session (40 trials OR 45 minutes)? Where are there errors? These details are important for interpreting Figure 1.

      (3) Figure 1 is unclear to me.
      a. Delayed vs immediate lever presses are being plotted - but I am not sure what is red, and what is blue. I might suggest plotting each animal.
      b. How many animals and sessions go into each data point?
      c. Table 1 (which might be better referenced in the paper) refers to rats by session. Is it true that some rats (2 and 8) were not analyzed for the bulk of the paper? Some rats appear to switch strategies, and some stay in one strategy. How many neurons come from each rat?
      d. Task basics - RT, choice, accuracy, video stills - might help readers understand what is going into these plots.
      e. Does the animal move differently (i.e., RTs) in G1 vs. G2?

      (4) I wasn't sure how clustered G1 vs. G2 vs G3 are. To make this argument, the raw data (or some axis of it) might help.
      a. This is particularly important because G3 appears to be a mix of G1 and G2, although upon inspection, I'm not sure how different they really are.
      b. Were there objective clustering criteria that defined the clusters?
      c. Why discuss G3 at all? Can these sessions be removed from analysis?

      (5) The same applies to neuronal analyses in Fig 3 and 4.
      a. What does a single neuron peri-event raster look like? I would include several of these.
      b. What do PC1, 2 and 3 look like for G1, G2, and G3?
      c. Certain PCs are selected, but I'm not sure how they were selected - was a criterion used? How was the correlation between PCA and ival selected? What about PCs that don't correlate with ival?
      d. If the authors are using PCA, then scree plots and PETHs might be useful, as well as comparisons to PCs from time-shuffled / randomized data.

      (6) I had questions about the spectral analysis.
      a. Theta has many definitions - why did the authors use 6-12 Hz? Does it come from the hippocampal literature, and is this the best definition of theta? What about other bands (delta, 1-4 Hz; theta, 4-7 Hz; and beta, 13-30 Hz)? These bands are of particular importance because they have been associated with errors, dopamine, and are abnormal in schizophrenia and Parkinson's disease.
      b. Power spectra and time-frequency analyses may justify the authors' focus. I would show these (y-axis - frequency, x-axis - time, z-axis, power).

      (7) PC3 as an autocorrelation doesn't seem to be the right way to infer theta entrainment or spike-field relationships, as PCA can be vulnerable to phantom oscillations, and coherence can be transient. It is also difficult to compare to traditional measures of phase-locking. Why not simply use spike-field coherence? This is particularly important with reference to the human literature, which the authors invoke.
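
      For reference, one traditional phase-locking measure is the mean resultant length of the LFP phases sampled at spike times. The sketch below is not spike-field coherence itself and is not taken from the manuscript; it assumes the theta phase at each spike has already been extracted (for example from a band-pass filtered LFP), and the function names are hypothetical.

```python
import cmath


def mean_resultant_length(spike_phases):
    """Phase-locking strength of spikes to an LFP rhythm.

    spike_phases: LFP phase in radians at each spike time (assumed to be
    precomputed). Returns a value between 0 (no phase preference) and
    1 (every spike at the same phase).
    """
    if not spike_phases:
        raise ValueError("need at least one spike phase")
    # Average the unit vectors pointing at each spike phase.
    resultant = sum(cmath.exp(1j * phase) for phase in spike_phases)
    return abs(resultant / len(spike_phases))


def rayleigh_z(spike_phases):
    """Rayleigh statistic n * R**2; larger values argue against uniform phases."""
    n = len(spike_phases)
    return n * mean_resultant_length(spike_phases) ** 2
```

      Because the mean resultant length is biased upward for small spike counts, comparisons between conditions are usually made with matched spike numbers or against a shuffled-phase baseline.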

    4. Reviewer #3 (Public Review):

      Summary:

      The study investigated decision making in rats choosing between small immediate rewards and larger delayed rewards, in a task design where the size of the immediate rewards decreased when this option was chosen and increased when it was not chosen. The authors conceptualise this task as involving two different types of cognitive effort; 'resistance-based' effort putatively needed to resist the smaller immediate reward, and 'resource-based' effort needed to track the changing value of the immediate reward option. They argue based on analyses of the behaviour, and computational modelling, that rats use different strategies in different sessions, with one strategy in which they consistently choose the delayed reward option irrespective of the current immediate reward size, and another strategy in which they preferentially choose the immediate reward option when the immediate reward size is large, and the delayed reward option when the immediate reward size is small. The authors recorded neural activity in anterior cingulate cortex (ACC) and argue that ACC neurons track the value of the immediate reward option irrespective of the strategy the rats are using. They further argue that the strategy the rats are using modulates their estimated value of the immediate reward option, and that oscillatory activity in the 6-12Hz theta band occurs when subjects use the 'resistance-based' strategy of choosing the delayed option irrespective of the current value of the immediate reward option. If solid, these findings will be of interest to researchers working on cognitive control and ACC's involvement in decision making. However, there are some issues with the experiment design, reporting, modelling and analysis which currently preclude high confidence in the validity of the conclusions.

      Strengths:

      The behavioural task used is interesting and the recording methods should enable the collection of good quality single unit and LFP electrophysiology data. The authors recorded from a sizable sample of subjects for this type of study. The approach of splitting the data into sessions where subjects used different strategies and then examining the neural correlates of each is in principle interesting, though I have some reservations about the strength of evidence for the existence of multiple strategies.

      Weaknesses:

      The dataset is very unbalanced in terms of both the number of sessions contributed by each subject, and their distribution across the different putative behavioural strategies (see table 1), with some subjects contributing 9 or 10 sessions and others only one session, and it is not clear from the text why this is the case. Further, only 3 subjects contribute any sessions to one of the behavioural strategies, while 7 contribute data to the other such that apparent differences in brain activity between the two strategies could in fact reflect differences between subjects, which could arise due to e.g. differences in electrode placement. To firm up the conclusion that neural activity is different in sessions where different strategies are thought to be employed, it would be important to account for potential cross-subject variation in the data. The current statistical methods don't do this as they all assume fixed effects (e.g. using trials or neurons as the experimental unit and ignoring which subject the neuron/trial came from).

      It is not obvious that the differences in behaviour between the sessions characterised as using the 'G1' and 'G2' strategies actually imply the use of different strategies, because the behavioural task was different in these sessions, with a shorter wait (4 seconds vs 8 seconds) for the delayed reward in the G1 strategy sessions where the subjects consistently preferred the delayed reward irrespective of the current immediate reward size. Therefore the differences in behaviour could be driven by a difference in the task (i.e. external world) rather than a difference in strategy (internal to the subject). It seems plausible that the higher value of the delayed reward option when the delay is shorter could account for the high probability of choosing this option irrespective of the current value of the immediate reward option, without appealing to the subjects using a different strategy.

      Further, even if the differences in behaviour do reflect different behavioural strategies, it is not obvious that these correspond to allocation of different types of cognitive effort. For example, subjects' failure to modify their choice probabilities to track the changing value of the immediate reward option might be due simply to valuing the delayed reward option higher, rather than not allocating cognitive effort to tracking immediate option value (indeed this is suggested by the neural data). Conversely, if the rats assign higher value to the delayed reward option in the G1 sessions, it is not obvious that choosing it requires overcoming 'resistance' through cognitive effort.

      The RL modelling used to characterise the subjects' behavioural strategies made some unusual and arguably implausible assumptions:

      i) The goal of the agent was to maximise the value of the immediate reward option (ival), rather than the standard assumption in RL modelling that the goal is to maximise long-run (e.g. temporally discounted) reward. It is not obvious why the rats should be expected to care about maximising the value of only one of their two choice options rather than distributing their choices to try and maximise long run reward.

      ii) The modelling assumed that the subject's choice could occur in 7 different states, defined by the history of their recent choices, such that every successive choice was made in a different state from the previous choice. This is a highly unusual assumption (most modelling of 2AFC tasks assumes all choices occur in the same state), as it causes learning on one trial not to generalise to the next trial, but only to other future trials where the recent choice history is the same.

      iii) The value update was non-standard in that rather than using the trial outcome (i.e. the amount of reward obtained) as the update target, it instead appeared to use some function of the value of the immediate reward option (it was not clear to me from the methods exactly how the fival and fqmax terms in the equation are calculated) irrespective of whether the immediate reward option was actually chosen.

      iv) The model used an e-greedy decision rule such that the probability of choosing the highest value option did not depend on the magnitude of the value difference between the two options. Typically, behavioural modelling uses a softmax decision rule to capture a graded relationship between choice probability and value difference.

      v) Unlike typical RL modelling where the learned value differences drive changes in subjects' choice preferences from trial to trial, to capture sensitivity to the value of the immediately rewarding option the authors had to add in a bias term which depended directly on this value (not mediated by any trial-to-trial learning). It is not clear how the rat is supposed to know the current trial ival if not by learning over previous trials, nor what purpose the learning component of the model serves if not to track the value of the immediate reward option.

      Given the task design, a more standard modelling approach would be to treat each choice as occurring in the same state, with the (temporally discounted) value of the outcomes obtained on each trial updating the value of the chosen option, and choice probabilities driven in a graded way (e.g. softmax) by the estimated value difference between the options. It would be useful to explicitly perform model comparison (e.g. using cross-validated log-likelihood with fitted parameters) of the authors' proposed model against more standard modelling approaches to test whether their assumptions are justified. It would also be useful to use logistic regression to evaluate how the history of choices and outcomes on recent trials affects the current trial choice, and compare these granular aspects of the choice data with simulated data from the model.
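
      A minimal sketch of the standard approach described above, for illustration only: a single-state Q-learner with a softmax choice rule, fitted by grid search on training sessions and scored by log-likelihood on held-out sessions. The function names, the grid-search fitting and the session format (a list of 0/1 choices paired with the discounted outcome obtained on each trial) are assumptions made here, not details from the manuscript.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(1 / (1 + exp(-x)))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def softmax_q_loglik(choices, outcomes, alpha, beta):
    """Log-likelihood of one session under single-state Q-learning with softmax.

    choices:  0 or 1 on each trial (e.g. immediate vs delayed lever).
    outcomes: value obtained on each trial (already temporally discounted).
    alpha:    learning rate; beta: inverse temperature.
    """
    q = [0.0, 0.0]
    loglik = 0.0
    for choice, outcome in zip(choices, outcomes):
        # Two-option softmax: P(choice) = sigmoid(beta * value difference).
        loglik += log_sigmoid(beta * (q[choice] - q[1 - choice]))
        # Update only the chosen option towards the obtained outcome.
        q[choice] += alpha * (outcome - q[choice])
    return loglik


def fit_grid(sessions, alphas, betas):
    """Return the (alpha, beta) pair maximising the summed log-likelihood."""
    grid = [(a, b) for a in alphas for b in betas]
    return max(grid, key=lambda p: sum(softmax_q_loglik(c, o, p[0], p[1])
                                       for c, o in sessions))


def held_out_loglik(train_sessions, test_sessions, alphas, betas):
    """Fit on the training sessions, then score the held-out sessions."""
    alpha, beta = fit_grid(train_sessions, alphas, betas)
    return sum(softmax_q_loglik(c, o, alpha, beta) for c, o in test_sessions)
```

      Computing the same held-out score for each competing model, averaged over folds, gives a comparison that does not reward a model merely for having more free parameters.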

      There were also some issues with the analyses of neural data which preclude strong confidence in their conclusions:

      Figure 4I makes the striking claim that ACC neurons track the value of the immediately rewarding option equally accurately in sessions where two putative behavioural strategies were used, despite the behaviour being insensitive to this variable in the G1 strategy sessions. The analysis quantifies the strength of correlation between a component of the activity extracted using a decoding analysis and the value of the immediate reward option. However, as far as I could see this analysis was not done in a cross-validated manner (i.e. evaluating the correlation strength on test data that was not used for either training the MCML model or selecting which component to use for the correlation). As such, the chance level correlation will certainly be greater than 0, and it is not clear whether the observed correlations are greater than expected by chance.
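
      The inflation of chance-level correlation by component selection can be demonstrated with a small pure-noise simulation. This sketch is illustrative and does not reproduce the MCML analysis; the trial and component counts are arbitrary assumptions, and neither the target nor the components carry any signal.

```python
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def selected_vs_held_out(n_trials=40, n_components=20, n_repeats=200, seed=0):
    """Mean |r| of the selected noise component, in-sample and held-out.

    On each repeat, the component most correlated with the target on the
    first half of the trials is selected, then re-scored on the second half.
    """
    rng = random.Random(seed)
    half = n_trials // 2
    in_sample, held_out = [], []
    for _ in range(n_repeats):
        target = [rng.gauss(0, 1) for _ in range(n_trials)]
        components = [[rng.gauss(0, 1) for _ in range(n_trials)]
                      for _ in range(n_components)]
        train_r = [abs(pearson_r(c[:half], target[:half])) for c in components]
        best = max(range(n_components), key=lambda j: train_r[j])
        in_sample.append(train_r[best])
        held_out.append(abs(pearson_r(components[best][half:], target[half:])))
    return sum(in_sample) / n_repeats, sum(held_out) / n_repeats


if __name__ == "__main__":
    print(selected_vs_held_out())
```

      Even the held-out value is above zero here because absolute correlations are averaged, which is why observed correlations are best judged against a shuffled or permuted null computed through the identical pipeline.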

      An additional caveat with the claim that ACC is tracking the value of the immediate reward option is that this value likely correlates with other behavioural variables, notably the current choice and recent choice history, that may be encoded in ACC. Encoding analyses (e.g. using linear regression to predict neural activity from behavioural variables) could allow quantification of the variance in ACC activity uniquely explained by option values after controlling for possible influence of other variables such as choice history (e.g. using a coefficient of partial determination).
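
      With a single control variable, the coefficient of partial determination equals the squared partial correlation, which can be computed from pairwise Pearson correlations. The sketch below illustrates that special case only; with several control variables a full regression (comparing residual sums of squares of the reduced and full models) would be needed. Variable names are hypothetical.

```python
def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def partial_r_squared(activity, value, control):
    """Share of variance in `activity` left unexplained by `control`
    that is explained by `value` (single control variable only).

    Undefined (division by zero) if `control` is perfectly correlated
    with either of the other two variables.
    """
    r_av = pearson_r(activity, value)
    r_ac = pearson_r(activity, control)
    r_vc = pearson_r(value, control)
    partial_r = (r_av - r_ac * r_vc) / ((1 - r_ac ** 2) * (1 - r_vc ** 2)) ** 0.5
    return partial_r ** 2
```

      In the present context `activity` would be a neuron's or component's trial-wise activity, `value` the immediate option value, and `control` a variable such as the previous choice.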

      Figure 5 argues that there are systematic differences in how ACC neurons represent the value of the immediate option (ival) in the G1 and G2 strategy sessions. This is interesting if true, but it appears possible that the effect is an artefact of the different distribution of option values between the two session types. Specifically, due to the way that ival is updated based on the subjects' choices, in G1 sessions where the subjects are mostly choosing the delayed option, ival will on average be higher than in G2 sessions where they are choosing the immediate option more often. The relative number of high, medium and low ival trials in the G1 and G2 sessions will therefore be different, which could drive systematic differences in the regression fit in the absence of real differences in the activity-value relationship. I have created an ipython notebook illustrating this, available at: https://notebooksharing.space/view/a3c4504aebe7ad3f075aafaabaf93102f2a28f8c189ab9176d4807cf1565f4e3. To verify that this is not driving the effect it would be important to balance the number of trials at each ival level across sessions (e.g. by subsampling trials) before running the regression.
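
      The balancing step suggested above can be sketched as follows. This is an illustrative implementation, not code from the linked notebook: trials are assumed to be (ival level, activity) pairs, and for every ival level present in both session types the same number of trials is drawn at random from each.

```python
import random
from collections import defaultdict


def balance_by_level(trials_a, trials_b, seed=0):
    """Subsample two trial sets so each ival level has equal counts in both.

    trials_a, trials_b: lists of (ival_level, activity) tuples, e.g. from
    G1 and G2 sessions. Levels missing from either set are dropped.
    """
    rng = random.Random(seed)
    by_level_a, by_level_b = defaultdict(list), defaultdict(list)
    for trial in trials_a:
        by_level_a[trial[0]].append(trial)
    for trial in trials_b:
        by_level_b[trial[0]].append(trial)

    balanced_a, balanced_b = [], []
    for level in sorted(set(by_level_a) & set(by_level_b)):
        # Keep as many trials as the smaller of the two sets has at this level.
        n_keep = min(len(by_level_a[level]), len(by_level_b[level]))
        balanced_a.extend(rng.sample(by_level_a[level], n_keep))
        balanced_b.extend(rng.sample(by_level_b[level], n_keep))
    return balanced_a, balanced_b
```

      Repeating the subsampling with different seeds and averaging the regression results would reduce the dependence on any single random draw.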

    5. Author response:

      eLife assessment

      The authors present a potentially useful approach of broad interest arguing that anterior cingulate cortex (ACC) tracks option values in decisions involving delayed rewards. The authors introduce the idea of a resource-based cognitive effort signal in ACC ensembles and link ACC theta oscillations to a resistance-based strategy. The evidence supporting these new ideas is incomplete and would benefit from additional detail and more rigorous analyses and computational methods.

      The reviewers have provided several excellent suggestions and pointed out important shortcomings of our manuscript. We are grateful for their efforts. To address these concerns, we are planning a major revision to the manuscript. In the revision, our goal is to address each of the reviewers' concerns and codify the evidence for resistance- and resource-based control signals in the rat anterior cingulate cortex. We have provided a nonexhaustive list of the items we plan to address in the point-by-point responses below.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Young (2.5 mo [adolescent]) rats were tasked to either press one lever for immediate reward or another for delayed reward.

      Please note that at the time of testing and training the rats were > 4 months old.

      The task had a complex structure in which (1) the number of pellets provided on the immediate reward lever changed as a function of the decisions made, (2) rats were prevented from pressing the same lever three times in a row. Importantly, this task is very different from most intertemporal choice tasks which adjust delay (to the delayed lever), whereas this task held the delay constant and adjusted the number of 20 mg sucrose pellets provided on the immediate value lever.

      Several studies parametrically vary the immediate lever (PMID: 39119916, 31654652, 28000083, 26779747, 12270518, 19389183). While most versions of the task will yield qualitatively similar estimates of discounting, the adjusting-amount version is preferred as it provides the most consistent estimates (PMID: 22445576). More specifically, this version of the task avoids the contrast effects that result from changing the delay during the session (PMID: 23963529, 24780379, 19730365, 35661751), which complicate value estimates.

      Analyses are based on separating sessions into groups, but group membership includes arbitrary requirements and many sessions have been dropped from the analyses.

      We are in discussions about how to address this valid concern. This includes simply splitting the data by delay. This approach, however, has conceptual problems that we will also lay out in a full revision.  

      Computational modeling is based on an overly simple reinforcement learning model, as evidenced by fit parameters pegging to the extremes.

      We apologize for not doing a better job of explaining the advantages of this type of model for the present purposes. Nevertheless, given the clear lack of enthusiasm, we felt it was better to simply update the model as suggested by the Reviewers. The straightforward modifications have now been implemented and we are currently in discussion about how the new results fit into the larger narrative.

      The neural analysis is overly complex and does not contain the necessary statistics to assess the validity of their claims.

      We plan to streamline the existing analysis and add statistics, where required, to address this concern.

      Strengths:

      The task is interesting.

      Thank you for the positive comment.

      Weaknesses:

      Behavior:

      The basic behavioral results from this task are not presented. For example, "each recording session consisted of 40 choice trials or 45 minutes". What was the distribution of choices over sessions? Did that change between rats? Did that change between delays? Were there any sequence effects? (I recommend looking at reaction times.) Were there any effects of pressing a lever twice vs after a forced trial?

      Animals tend to make more immediate choices as the delay is extended, which is reflected in Figure 1. We will add more detail and additional statistics to address these questions. 

      This task has a very complicated sequential structure that I think I would be hard pressed to follow if I were performing this task.

      Human tasks implement a similar task structure (PMID: 26779747). Please note the response above that outlines the benefits of using this task.

      Before diving into the complex analyses assuming reinforcement learning paradigms or cognitive control, I would have liked to have understood the basic behaviors the rats were taking. For example, what was the typical rate of lever pressing? If the rats are pressing 40 times in 45 minutes, does waiting 8s make a large difference?

      This is a good suggestion. However, rats do not like waiting for rewards, even small delays. Going from the 4 sec to the 8 sec delay results in more immediate choices, indicating that the rats will forgo waiting for a smaller reinforcer at the 8 sec delay as compared to the 4 sec.

      For that matter, the reaction time from lever appearance to lever pressing would be very interesting (and important). Are they making a choice as soon as the levers appear? Are they leaning towards the delay side, but then give in and choose the immediate lever? What are the reaction time hazard distributions?

      These are excellent suggestions. We are looking into implementing them.

      It is not clear that the animals were actually using cognitive control strategies on this task. One cannot assume from the task that cognitive control is key. The authors only consider a very limited number of potential behaviors (an overly simple RL model). On this task, there are many potential behavioral strategies: "win-stay/lose-shift", "perseveration", "alternation", and even "random choices" should be considered.

      The strategies the Reviewer mentioned are descriptors of the actual choices the rats made. For example, perseveration means the rat is choosing one of the levers at an excessively high rate whereas alternation means it is choosing the two levers more or less equally, independent of payouts. But the question we are interested in is why? We are arguing that the type of cognitive control determines the choice behavior but cognitive control is an internal variable that guides behavior, rather than simply a descriptor of the behavior. For example, the animal opts to perseverate on the delayed lever because the cognitive control required to track ival is too high. We then searched the neural data for signatures of the two types of cognitive control.

      The delay lever was assigned to the "non-preferred side". How did side bias affect the decisions made?

      The side bias clearly does not impact performance as the animals prefer the delay lever at shorter delays, which works against this bias.

      The analyses based on "group" are unjustified. The authors compare the proportion of delayed to immediate lever press choices on the non-forced trials and then did k-means clustering on this distribution. But the distribution itself was not shown, so it is unclear whether the "groups" were actually different. They used k=3, but do not describe how this arbitrary number was chosen. (Is 3 the optimal number of clusters to describe this distribution?) Moreover, they removed three group 1 sessions with an 8s delay and two group 2 sessions with a 4s delay, making all the group 1 sessions 4s delay sessions and all group 2 sessions 8s delay sessions. They then ignore group 3 completely. These analyses seem arbitrary and unnecessarily complex. I think they need to analyze the data by delay. (How do rats handle 4s delay sessions? How do rats handle 6s delay sessions? How do rats handle 8s delay sessions?). If they decide to analyze the data by strategy, then they should identify specific strategies, model those strategies, and do model comparison to identify the best explanatory strategy. Importantly, the groups were session-based, not rat based, suggesting that rats used different strategies based on the delay to the delayed lever.

      These are excellent points and, as stated above, we are in the process of revisiting the group assignments in an effort to allay these criticisms.

      The reinforcement learning model used was overly simple. In particular, the RL model assumes that the subjects understand the task structure, but we know that even humans have trouble following complex task structures. Moreover, we know that rodent decision-making depends on much more complex strategies (model-based decisions, multi-state decisions, rate-based decisions, etc). There are lots of other ways to encode these decision variables, such as softmax with an inverse temperature rather than epsilon-greedy. The RL model was stated as a given and not justified. As one critical example, the RL model fit to the data assumed a constant exponential discounting function, but it is well-established that all animals, including rodents, use hyperbolic discounting in intertemporal choice tasks. Presumably this changes dramatically the effect of 4s and 8s. As evidence that the RL model is incomplete, the parameters found for the two groups were extreme. (Alpha=1 implies no history and only reacting to the most recent event. Epsilon=0.4 in an epsilon-greedy algorithm is a 40% chance of responding randomly.)

      Please see our response above. We agree that the approach was not justified, but we do not agree that it is invalid. Simply stated, a softmax approach gives the best fit to the choice behavior, whereas our epsilon-greedy approach attempted to reproduce the choice behavior using a naïve agent that progressively learns the values of the two levers on a choice-by-choice basis. The epsilon-greedy approach can therefore tell us whether it is possible to reproduce the choice behavior by an agent that is only tracking ival. Given our discovery of an ival-tracking signal in ACC, we believed that this was a critical point (although admittedly we did a poor job of communicating it). However, we also appreciate that important insights can be gained by fitting a model to the data as suggested. In fact, we had implemented this approach initially and are currently reconsidering what it can tell us in light of the Reviewers' comments.

      The authors do add a "dbias" (which is a preference for the delayed lever) term to the RL model, but note that it has to be maximal in the 4s condition to reproduce group 2 behavior, which means they are not doing reinforcement learning anymore, just choosing the delayed lever.

      Exactly. The model results indicated that a naïve agent that relied only on ival tracking would not behave in this manner. It was therefore unlikely that the G1 animals were using an ival-tracking strategy, even though a strong ival-tracking signal was present in ACC.

      Neurophysiology:

      The neurophysiology figures are unclear and mostly uninterpretable; they do not show variability, statistics or conclusive results.

      While the reviewer is justified in criticizing the clarity of the figures, the statement that “they do not show variability, statistics or conclusive results” is demonstrably false. Each of the figures presented in the manuscript, except Figure 3, is accompanied by statistics and measures of variability. This comment is hyperbolic and not justified.

      Figure 3 was an attempt to show raw neural data to better demonstrate how robust the ivalue tracking signal is.

      As with the behavior, I would have liked to have seen more traditional neurophysiological analyses first. What do the cells respond to? How do the manifolds change aligned to the lever presses? Are those different between lever presses?

      We provide several figures describing how neurons change firing rates in response to varying reward. We are unsure what the reviewer means by “traditional analysis”, especially since this is immediately followed by a request for an assessment of neural manifolds. That said, we are developing ways to make the analysis more intuitive and, hopefully, more “traditional”.

      Are there changes in cellular information (both at the individual and ensemble level) over time in the session?

      We provide several analyses of how firing rate changes over trials in relation to ival over time in the session.

      How do cellular responses differ during that delay while both levers are out, but the rats are not choosing the immediate lever?

      It is not clear to us how this analysis addresses our hypothesis regarding control signals in ACC.

      Figure 3, for example, claims that some of the principal components tracked the number of pellets on the immediate lever ("ival"), but they are just two curves. No statistics, controls, or justification for this is shown. BTW, on Figure 3, what is the event at 200s?

      Figure 3 will be folded into one of the other figures that contains the summary statistics.

      I'm confused. On Figure 4, the number of trials seems to go up to 50, but in the methods, they say that rats received 40 trials or 45 minutes of experience.

      This analysis included forced trials. The maximum in a session is 40 choice trials. We will clarify this in the revised manuscript.

      At the end of page 14, the authors state that the strength of the correlation did not differ by group and that this was "predicted" by the RL modeling, but this statement is nonsensical, given that the RL modeling did not fit the data well and depended on extreme values. Moreover, this claim is dependent on "not statistically detectable", which is, of course, not interpretable as "not different".

      We plan to revisit this analysis and the RL model.

      There is an interesting result on page 16 that the increases in theta power were observed before a delayed lever press but not an immediate lever press, and then that the theta power declined after an immediate lever press.

      Thank you for the positive comment.

      These data are separated by session group (again group 1 is a subset of the 4s sessions, group 2 is a subset of the 8s sessions, and group 3 is ignored). I would much rather see these data analyzed by delay itself or by some sort of strategy fit across delays.

      Provisional analysis indicates that the results hold up over delays, rather than the groupings in the paper. We will address this in a full revision of the manuscript.

      That being said, I don't see how this description shows up in Figure 6. What does Figure 6 look like if you just separate the sessions by delay?

      We are unclear what the reviewer means by “this description”.

      Discussion:

      Finally, it is unclear to what extent this task actually gets at the questions originally laid out in the goals and returned to in the discussion. The idea of cognitive effort is interesting, but there is no data presented that this task is cognitive at all. The idea of a resourced cognitive effort and a resistance cognitive effort is interesting, but presumably the way one overcomes resistance is through resource-limited components, so it is unclear that these two cognitive effort strategies are different.

      We view the strong evidence for ival tracking presented herein as a potentially critical component of resource-based cognitive effort. We hope to explain more clearly how this task engaged cognitive effort.

      The authors state that "ival-tracking" (neurons and ensembles that presumably track the number of pellets being delivered on the immediate lever - a fancy name for "expectations") "taps into a resourced-based form of cognitive effort", but no evidence is actually provided that keeping track of the expectation of reward on the immediate lever depends on attention or mnemonic resources. They also state that a "dLP-biased strategy" (waiting out the delay) is a "resistance-based form of cognitive effort" but no evidence is made that going to the delayed side takes effort.

      There is a well-developed literature that rats and mice do not like waiting for delayed reinforcers. We contend that enduring something you don’t like takes effort.

      The authors talk about theta synchrony, but never actually measure theta synchrony, particularly across structures such as amygdala or ventral hippocampus. The authors try to connect this to "the unpleasantness of the delay", but provide no measures of pleasantness or unpleasantness. They have no evidence that waiting out an 8s delay is unpleasant.

      We will better clarify how our measure of theta power relates to synchrony. There is a well-developed literature that rats and mice do not like waiting for delayed reinforcers.

      The authors hypothesize that the "ival-tracking signal" (the expectation of number of pellets on the immediate lever) "could simply reflect the emotional or autonomic response". Aside from the fact that no evidence for this is provided, if this were to be true, then, in what sense would any of these signals be related to cognitive control?

      This is proposed as an alternative explanation for the ivalue signal. We provide this as a possibility, never a conclusion. We will clarify this in the revised text.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript explores the neuronal signals that underlie resistance vs resource-based models of cognitive effort. The authors use a delayed discounting task and computational models to explore these ideas. The authors find that the ACC strongly tracks value and time, which is consistent with prior work. Novel contributions include quantification of a resource-based control signal among ACC ensembles, and linking ACC theta oscillations to a resistance-based strategy.

      Strengths:

      The experiments and analyses are well done and have the potential to generate an elegant explanatory framework for ACC neuronal activity. The inclusion of local-field potential / spike-field analyses is particularly important because these can be measured in humans.

      Thank you for the endorsement of our work.

      Weaknesses:

      I had questions that might help me understand the task and details of neuronal analyses.

      (1) The abstract, discussion, and introduction set up an opposition between resource and resistance based forms of cognitive effort. It's clear that the authors find evidence for each (ACC ensembles = resource, theta=resistance?) but I'm not sure where the data fall on this dichotomy.

      a. An overall very simple schematic early in the paper (prior to the MCML model? or even the behavior) may help illustrate the main point.

      b. In the intro, results, and discussion, it may help to relate each point to this dichotomy.

      c. What would resource-based signals look like? What would resistance based signals look like? Is the main point that resistance-based strategies dominate when delays are short, but resource-based strategies dominate when delays are long?

      d. I wonder if these strategies can be illustrated? Could these two measures (dLP vs ival tracking) be plotted on separate axes or extremes, and behavior, neuronal data, LFP, and spectral relationships be shown on these axes? I think Figure 2 is working towards this. Could these be shown for each delay length? This way, as the evidence from behavior, model, single neurons, ensembles, and theta is presented, it can be related to this framework, and the reader can organize the findings.

      These are excellent suggestions, and we intend to implement each of them, where possible.

      (2) The task is not clear to me.

      a. I wonder if a task schematic and a flow chart of training would help readers.

      Yes, excellent idea, we intend to include this.

      b. This task appears to be relatively new. Has it been used before in rats (Oberlin and Grahame is a mouse study)? Some history / context might help orient readers.

      Indeed, this task has been used in rats in several prior studies. Please see the following references (PMID: 39119916, 31654652, 28000083, 26779747, 12270518, 19389183).

      c. How many total sessions were completed with ascending delays? Was there criteria for surgeries? How many total recording sessions per animal (of the 54?)

      Please note that the delay does not change within a session. There were no criteria for surgery. In addition, we will update Table 1 to make the number of recording sessions clearer.

      d. How many trials completed per session (40 trials OR 45 minutes)? Where are there errors? These details are important for interpreting Figure 1.

      Every animal in this data set completed 40 trials. We will update the task description to clarify this issue. There are no errors in this task; rather, the task is designed to measure the tendency to make an impulsive choice (smaller reward now). We will clarify this issue in the revision of the manuscript.

      (3) Figure 1 is unclear to me.

      a. Delayed vs immediate lever presses are being plotted - but I am not sure what is red, and what is blue. I might suggest plotting each animal.

      We will clarify the colors and look into schemes to graph the data set.

      b. How many animals and sessions go into each data point?

      This information is in Table 1, but this could be clearer, and we will update the manuscript.

      c. Table 1 (which might be better referenced in the paper) refers to rats by session. Is it true that some rats (2 and 8) were not analyzed for the bulk of the paper? Some rats appear to switch strategies, and some stay in one strategy. How many neurons come from each rat?

      Table 1 is accurate, and we can add the number of neurons from each animal.

      d. Task basics - RT, choice, accuracy, video stills - might help readers understand what is going into these plots

      e. Does the animal move differently (i.e., RTs) in G1 vs. G2?

      We will look into ways to incorporate this information.

      (4) I wasn't sure how clustered G1 vs. G2 vs G3 are. To make this argument, the raw data (or some axis of it) might help.

      a. This is particularly important because G3 appears to be a mix of G1 and G2, although upon inspection, I'm not sure how different they really are

      b. Was there some objective clustering criteria that defined the clusters?

      c. Why discuss G3 at all? Can these sessions be removed from analysis?

      These are all excellent suggestions and points. We plan to revisit the strategy to assign sessions to groups, which we hope will address each of these points.

      (5) The same applies to neuronal analyses in Fig 3 and 4

      a. What does a single neuron peri-event raster look like? I would include several of these.

      b. What does PC1, 2 and 3 look like for G1, G2, and G3?

      c. Certain PCs are selected, but I'm not sure how they were selected - was there a criteria used? How was the correlation between PCA and ival selected? What about PCs that don't correlate with ival?

      d. If the authors are using PCA, then scree plots and PETHs might be useful, as well as comparisons to PCs from time-shuffled / randomized data.

      We will make several updates to enhance the clarity of the neural data analysis, including adding more representative examples. We feel the need to balance the inclusion of representative examples with group statistics given the concerns raised by R1.

      (6) I had questions about the spectral analysis

      a. Theta has many definitions - why did the authors use 6-12 Hz? Does it come from the hippocampal literature, and is this the best definition of theta? What about other bands (delta, 1-4 Hz; theta, 4-7 Hz; and beta, 13-30 Hz)? These bands are of particular importance because they have been associated with errors, dopamine, and are abnormal in schizophrenia and Parkinson's disease.

      This designation comes mainly from the hippocampal and ACC literature in rodents. In addition, this range best captured the peak in the power spectrum in our data. Note that we focus our analysis on theta given the literature regarding theta in the ACC as a correlate of cognitive control (references in manuscript). We did interrogate other bands as a sanity check and the results were mostly limited to theta. Given the scope of our manuscript and the concerns raised regarding complexity, we are concerned that adding frequency analyses beyond theta obfuscates the take-home message. However, we think this is worthy, and we will determine if this can be done in a brief, clear, and effective manner.

      b. Power spectra and time-frequency analyses may justify the authors' focus. I would show these (y-axis - frequency, x-axis - time, z-axis, power).

      This is an excellent suggestion that we look forward to incorporating. 

      (7) PC3 as an autocorrelation doesn't seem to be the right way to infer theta entrainment or spike-field relationships, as PCA can be vulnerable to phantom oscillations, and coherence can be transient. It is also difficult to compare to traditional measures of phase-locking. Why not simply use spike-field coherence? This is particularly important with reference to the human literature, which the authors invoke.

      Excellent suggestion. We will look into the phantom oscillation issue. Note that PCA provided a way to classify neurons that exhibited peaks in the autocorrelation at theta frequencies. While spike-field coherence is a rigorous tool, it addresses a slightly different question (LFP entrainment). Notwithstanding, we plan to address this issue.  
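
      As a point of reference for the "traditional measures of phase-locking" mentioned by the reviewer, one widely used quantity is the mean resultant length of the LFP phases at which spikes occur. The sketch below is a generic illustration under the assumption that spike phases (in radians) have already been extracted from a band-filtered LFP; it is not the analysis used in the manuscript, and the function name and example phases are hypothetical.

```python
import numpy as np

def mean_resultant_length(spike_phases):
    """Phase-locking strength: length of the average unit vector of the
    LFP phases (radians) sampled at spike times. Ranges from 0 to 1."""
    phases = np.asarray(spike_phases, dtype=float)
    return float(np.abs(np.mean(np.exp(1j * phases))))

# Hypothetical extremes: every spike at the same phase versus spikes
# spread evenly around the cycle.
locked = np.full(100, 0.3)
spread = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)

locked_strength = mean_resultant_length(locked)
spread_strength = mean_resultant_length(spread)
```

      Note that this statistic is biased upward for small spike counts, so comparisons across neurons usually require matched spike numbers or a bias-corrected variant.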

      Reviewer #3 (Public Review):

      Summary:

      The study investigated decision making in rats choosing between small immediate rewards and larger delayed rewards, in a task design where the size of the immediate rewards decreased when this option was chosen and increased when it was not chosen. The authors conceptualise this task as involving two different types of cognitive effort; 'resistance-based' effort putatively needed to resist the smaller immediate reward, and 'resource-based' effort needed to track the changing value of the immediate reward option. They argue based on analyses of the behaviour, and computational modelling, that rats use different strategies in different sessions, with one strategy in which they consistently choose the delayed reward option irrespective of the current immediate reward size, and another strategy in which they preferentially choose the immediate reward option when the immediate reward size is large, and the delayed reward option when the immediate reward size is small. The authors recorded neural activity in anterior cingulate cortex (ACC) and argue that ACC neurons track the value of the immediate reward option irrespective of the strategy the rats are using. They further argue that the strategy the rats are using modulates their estimated value of the immediate reward option, and that oscillatory activity in the 6-12Hz theta band occurs when subjects use the 'resistance-based' strategy of choosing the delayed option irrespective of the current value of the immediate reward option. If solid, these findings will be of interest to researchers working on cognitive control and ACC's involvement in decision making. However, there are some issues with the experiment design, reporting, modelling and analysis which currently preclude high confidence in the validity of the conclusions.

      Strengths:

      The behavioural task used is interesting and the recording methods should enable the collection of good quality single unit and LFP electrophysiology data. The authors recorded from a sizable sample of subjects for this type of study. The approach of splitting the data into sessions where subjects used different strategies and then examining the neural correlates of each is in principle interesting, though I have some reservations about the strength of evidence for the existence of multiple strategies.

      Thank you for the positive comments.

      Weaknesses:

      The dataset is very unbalanced in terms of both the number of sessions contributed by each subject, and their distribution across the different putative behavioural strategies (see table 1), with some subjects contributing 9 or 10 sessions and others only one session, and it is not clear from the text why this is the case. Further, only 3 subjects contribute any sessions to one of the behavioural strategies, while 7 contribute data to the other such that apparent differences in brain activity between the two strategies could in fact reflect differences between subjects, which could arise due to e.g. differences in electrode placement. To firm up the conclusion that neural activity is different in sessions where different strategies are thought to be employed, it would be important to account for potential cross-subject variation in the data. The current statistical methods don't do this as they all assume fixed effects (e.g. using trials or neurons as the experimental unit and ignoring which subject the neuron/trial came from).

      This is an important issue that we plan to address with additional analysis in the manuscript update.

      It is not obvious that the differences in behaviour between the sessions characterised as using the 'G1' and 'G2' strategies actually imply the use of different strategies, because the behavioural task was different in these sessions, with a shorter wait (4 seconds vs 8 seconds) for the delayed reward in the G1 strategy sessions where the subjects consistently preferred the delayed reward irrespective of the current immediate reward size. Therefore the differences in behaviour could be driven by difference in the task (i.e. external world) rather than a difference in strategy (internal to the subject). It seems plausible that the higher value of the delayed reward option when the delay is shorter could account for the high probability of choosing this option irrespective of the current value of the immediate reward option, without appealing to the subjects using a different strategy.

      Further, even if the differences in behaviour do reflect different behavioural strategies, it is not obvious that these correspond to allocation of different types of cognitive effort. For example, subjects' failure to modify their choice probabilities to track the changing value of the immediate reward option might be due simply to valuing the delayed reward option higher, rather than not allocating cognitive effort to tracking immediate option value (indeed this is suggested by the neural data). Conversely, if the rats assign higher value to the delayed reward option in the G1 sessions, it is not obvious that choosing it requires overcoming 'resistance' through cognitive effort.

      The RL modelling used to characterise the subject's behavioural strategies made some unusual and arguably implausible assumptions:

      i) The goal of the agent was to maximise the value of the immediate reward option (ival), rather than the standard assumption in RL modelling that the goal is to maximise long-run (e.g. temporally discounted) reward. It is not obvious why the rats should be expected to care about maximising the value of only one of their two choice options rather than distributing their choices to try and maximise long run reward.

      ii) The modelling assumed that the subject's choice could occur in 7 different states, defined by the history of their recent choices, such that every successive choice was made in a different state from the previous choice. This is a highly unusual assumption (most modelling of 2AFC tasks assumes all choices occur in the same state), as it causes learning on one trial not to generalise to the next trial, but only to other future trials where the recent choice history is the same.

      iii) The value update was non-standard in that rather than using the trial outcome (i.e. the amount of reward obtained) as the update target, it instead appeared to use some function of the value of the immediate reward option (it was not clear to me from the methods exactly how the fival and fqmax terms in the equation are calculated) irrespective of whether the immediate reward option was actually chosen.

      iv) The model used an e-greedy decision rule such that the probability of choosing the highest value option did not depend on the magnitude of the value difference between the two options. Typically, behavioural modelling uses a softmax decision rule to capture a graded relationship between choice probability and value difference.

      v) Unlike typical RL modelling where the learned value differences drive changes in subjects' choice preferences from trial to trial, to capture sensitivity to the value of the immediately rewarding option the authors had to add in a bias term which depended directly on this value (not mediated by any trial-to-trial learning). It is not clear how the rat is supposed to know the current trial ival if not by learning over previous trials, nor what purpose the learning component of the model serves if not to track the value of the immediate reward option.

      Given the task design, a more standard modelling approach would be to treat each choice as occurring in the same state, with the (temporally discounted) value of the outcomes obtained on each trial updating the value of the chosen option, and choice probabilities driven in a graded way (e.g. softmax) by the estimated value difference between the options. It would be useful to explicitly perform model comparison (e.g. using cross-validated log-likelihood with fitted parameters) of the authors' proposed model against more standard modelling approaches to test whether their assumptions are justified. It would also be useful to use logistic regression to evaluate how the history of choices and outcomes on recent trials affects the current trial choice, and compare these granular aspects of the choice data with simulated data from the model.

      Each of the issues outlined above with the RL model is very important. We are currently re-evaluating the RL modeling approach in light of these comments. Please see comments to R1 regarding the model as they are relevant for this as well.
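
      The "more standard" approach described by the reviewer (a single state, discounted outcomes updating the chosen option, graded choice) can be sketched as follows. This is an illustration only, not the authors' model or a fitted one: the 6-pellet delayed reward, the discount parameters, the learning rate, and the fixed choice sequence are all hypothetical.

```python
import math

def hyperbolic_value(reward, delay_s, k):
    """Hyperbolically discounted value: reward / (1 + k * delay)."""
    return reward / (1.0 + k * delay_s)

def exponential_value(reward, delay_s, rate):
    """Exponentially discounted value: reward * exp(-rate * delay)."""
    return reward * math.exp(-rate * delay_s)

def update_q(q, choice, outcome_value, alpha):
    """Single-state delta rule: only the chosen option's value is updated."""
    q = list(q)
    q[choice] += alpha * (outcome_value - q[choice])
    return q

def p_choose_delayed(q, beta):
    """Two-option softmax, i.e. a logistic function of the value difference.
    Index 0 is the immediate lever, index 1 the delayed lever."""
    return 1.0 / (1.0 + math.exp(-beta * (q[1] - q[0])))

# Hypothetical delayed reward of 6 pellets, discounted at 4 s and 8 s.
v4 = hyperbolic_value(6.0, delay_s=4.0, k=0.25)
v8 = hyperbolic_value(6.0, delay_s=8.0, k=0.25)

# Hypothetical trial sequence: (choice, discounted outcome value).
q = [0.0, 0.0]
for choice, outcome in [(1, v4), (0, 2.0), (1, v4)]:
    q = update_q(q, choice, outcome, alpha=0.5)

p_delayed = p_choose_delayed(q, beta=1.0)
```

      In a model comparison of the kind the reviewer suggests, the free parameters (here `k`, `alpha`, `beta`) would be fitted to each session's choices and scored on held-out trials.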

      There were also some issues with the analyses of neural data which preclude strong confidence in their conclusions:

      Figure 4I makes the striking claim that ACC neurons track the value of the immediately rewarding option equally accurately in sessions where two putative behavioural strategies were used, despite the behaviour being insensitive to this variable in the G1 strategy sessions. The analysis quantifies the strength of correlation between a component of the activity extracted using a decoding analysis and the value of the immediate reward option. However, as far as I could see this analysis was not done in a cross-validated manner (i.e. evaluating the correlation strength on test data that was not used for either training the MCML model or selecting which component to use for the correlation). As such, the chance level correlation will certainly be greater than 0, and it is not clear whether the observed correlations are greater than expected by chance.

      This is an astute observation and we plan to address this concern. We agree that cross-validation may provide an appropriate tool here.
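
      The reviewer's point about chance level can be illustrated with pure noise. The sketch below is a generic simulation, not the MCML analysis from the manuscript: it assumes 10 candidate components and 40 trials (both hypothetical), picks the component that correlates best with a random target, and compares the correlation measured on the same trials with the correlation measured on held-out trials.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_components, n_repeats = 40, 10, 500

def corr(a, b):
    """Pearson correlation between two 1-D arrays."""
    return np.corrcoef(a, b)[0, 1]

train = np.arange(0, n_trials, 2)  # even trials select the component
test = np.arange(1, n_trials, 2)   # odd trials evaluate it

in_sample, held_out = [], []
for _ in range(n_repeats):
    ival = rng.normal(size=n_trials)                   # random "target"
    comps = rng.normal(size=(n_components, n_trials))  # pure-noise components

    # Selection and evaluation on the same trials: biased above zero.
    r_all = np.array([corr(c, ival) for c in comps])
    in_sample.append(np.max(np.abs(r_all)))

    # Selection on training trials, evaluation on held-out trials.
    r_train = np.array([corr(c[train], ival[train]) for c in comps])
    best = int(np.argmax(np.abs(r_train)))
    sign = np.sign(r_train[best])
    held_out.append(sign * corr(comps[best][test], ival[test]))

in_sample_mean = float(np.mean(in_sample))
held_out_mean = float(np.mean(held_out))
```

      Even though no component carries any information about the target, the in-sample correlation of the selected component is well above zero on average, while the held-out correlation averages close to zero.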

      An additional caveat with the claim that ACC is tracking the value of the immediate reward option is that this value likely correlates with other behavioural variables, notably the current choice and recent choice history, that may be encoded in ACC. Encoding analyses (e.g. using linear regression to predict neural activity from behavioural variables) could allow quantification of the variance in ACC activity uniquely explained by option values after controlling for possible influence of other variables such as choice history (e.g. using a coefficient of partial determination).

      This is also an excellent point that we plan to address in the manuscript update.
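
      The coefficient of partial determination mentioned by the reviewer compares a regression with and without the variable of interest. The sketch below uses simulated data under a deliberately adversarial assumption: activity is driven only by a choice-history variable that happens to correlate with ival. All variable names and effect sizes are hypothetical.

```python
import numpy as np

def sse(X, y):
    """Residual sum of squares of an ordinary least-squares fit with intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return float(resid @ resid)

def partial_r2(X_reduced, X_extra, y):
    """Fraction of the reduced model's residual variance that is explained
    by adding X_extra: (SSE_reduced - SSE_full) / SSE_reduced."""
    sse_reduced = sse(X_reduced, y)
    sse_full = sse(np.column_stack([X_reduced, X_extra]), y)
    return (sse_reduced - sse_full) / sse_reduced

rng = np.random.default_rng(1)
n = 2000
choice_history = rng.normal(size=n)
ival = choice_history + 0.5 * rng.normal(size=n)         # correlated with history
activity = 2.0 * choice_history + rng.normal(size=n)     # driven by history only

raw_r2 = float(np.corrcoef(activity, ival)[0, 1] ** 2)
unique_r2 = partial_r2(choice_history, ival, activity)
```

      In this simulation the raw correlation between activity and ival is strong, yet ival explains almost none of the variance once choice history is accounted for, which is the confound the reviewer describes.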

      Figure 5 argues that there are systematic differences in how ACC neurons represent the value of the immediate option (ival) in the G1 and G2 strategy sessions. This is interesting if true, but it appears possible that the effect is an artefact of the different distribution of option values between the two session types. Specifically, due to the way that ival is updated based on the subjects' choices, in G1 sessions where the subjects are mostly choosing the delayed option, ival will on average be higher than in G2 sessions where they are choosing the immediate option more often. The relative number of high, medium and low ival trials in the G1 and G2 sessions will therefore be different, which could drive systematic differences in the regression fit in the absence of real differences in the activity-value relationship. I have created an ipython notebook illustrating this, available at: https://notebooksharing.space/view/a3c4504aebe7ad3f075aafaabaf93102f2a28f8c189ab9176d4807cf1565f4e3. To verify that this is not driving the effect it would be important to balance the number of trials at each ival level across sessions (e.g. by subsampling trials) before running the regression.

      Excellent point and thank you for the notebook. We explored a similar approach previously but did not pursue it to completion. We will re-investigate this issue.
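
      The balancing step suggested by the reviewer can be sketched as a per-level subsampling routine. This is an illustration under stated assumptions, not code from the reviewer's notebook or the manuscript: the function name and the example ival levels are hypothetical, and ival levels that occur in only one session type are dropped.

```python
import numpy as np

def balanced_indices(levels_a, levels_b, rng):
    """Return trial indices for two session types such that every shared
    ival level contributes the same number of trials to each."""
    levels_a = np.asarray(levels_a)
    levels_b = np.asarray(levels_b)
    keep_a, keep_b = [], []
    for level in np.intersect1d(levels_a, levels_b):
        idx_a = np.flatnonzero(levels_a == level)
        idx_b = np.flatnonzero(levels_b == level)
        n = min(idx_a.size, idx_b.size)
        # Randomly subsample the larger set down to the smaller one.
        keep_a.extend(rng.choice(idx_a, size=n, replace=False))
        keep_b.extend(rng.choice(idx_b, size=n, replace=False))
    return np.sort(np.array(keep_a, dtype=int)), np.sort(np.array(keep_b, dtype=int))

# Hypothetical ival levels: G1 skews high, G2 skews low.
g1_ival = np.array([4, 5, 5, 6, 6, 6])
g2_ival = np.array([1, 2, 3, 4, 4, 5])

rng = np.random.default_rng(0)
idx_g1, idx_g2 = balanced_indices(g1_ival, g2_ival, rng)
balanced_g1 = np.sort(g1_ival[idx_g1])
balanced_g2 = np.sort(g2_ival[idx_g2])
```

      After balancing, both session types have the same ival distribution, so any remaining difference in the regression fit cannot be attributed to unequal sampling of ival levels. Repeating the subsampling with different seeds gives a sense of how stable the result is.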

    1. Joint Public Review:

      Summary:

      This study retrospectively analyzed clinical data to develop a risk prediction model for pulmonary hypertension in high-altitude populations. This finding holds clinical significance as it can be used for intuitive and individualized prediction of pulmonary hypertension risk in these populations. The strength of evidence is high, utilizing a large cohort of 6,603 patients and employing statistical methods such as LASSO regression. The model demonstrates satisfactory performance metrics, including AUC values and calibration curves, enhancing its clinical applicability.

      Strengths:

      (1) Large Sample Size: The study utilizes a substantial cohort of 6,603 subjects, enhancing the reliability and generalizability of the findings.

      (2) Robust Methodology: The use of advanced statistical techniques, including least absolute shrinkage and selection operator (LASSO) regression and multivariate logistic regression, ensures the selection of optimal predictive features.

      (3) Clinical Utility: The developed nomograms are user-friendly and can be easily implemented in clinical settings, particularly in resource-limited high-altitude regions.

      (4) Performance Metrics: The models demonstrate satisfactory performance, with strong AUC values and well-calibrated curves, indicating accurate predictions.

      Weaknesses:

      (1) Lack of External Validation: The models were validated internally, but external validation with cohorts from other high-altitude regions is necessary to confirm their generalizability.

      (2) Simplistic Predictors: The reliance on ECG and basic demographic data may overlook other potential predictors that could improve the models' accuracy and predictive power.

      (3) Regional Specificity: The study's cohort is limited to Tibet, and the findings may not be directly applicable to other high-altitude populations without further validation.

    1. Reviewer #5 (Public Review):

      After reading the manuscript and the concerns raised by reviewer 2 I see both sides of the argument - the relative location of trigeminal nucleus versus the inferior olive is quite different in elephants (and different from previous studies in elephants), but when there is a large disproportionate magnification of a behaviorally relevant body part at most levels of the nervous system (certainly in the cortex and thalamus), you can get major shifting in the location of different structures. In the case of the elephant, it looks like there may be a lot of shifting. Something that is compelling is that the number of modules separated by the myelin bands corresponds to the number of trunk folds, which is different in the different elephants. This sort of modular division based on body parts is a general principle of mammalian brain organization (demonstrated beautifully for the cuneate and gracile nucleus in primates, VP in most species, S1 in a variety of mammals such as the star nosed mole and duck-billed platypus). I don't think these relative changes in the brainstem would require major genetic programming - although some surely exist. Rodents and elephants have been independently evolving for over 60 million years so there is a substantial amount of time for changes in each lineage to occur.

      I agree that the authors have identified the trigeminal nucleus correctly, although comparisons with more out-groups would be needed to confirm this (although I'm not suggesting that the authors do this). I also think the new figure (which shows previous divisions of the brainstem versus their own) allows the reader to consider these issues for themselves. When reviewing this paper, I actually took the time to go through atlases of other species and even look at some of my own data from highly derived species. Establishing homology across groups based only on relative location is tough especially when there appears to be large shifts in the relative location of structures. My thoughts are that the authors did an extraordinary amount of work on obtaining, processing and analyzing this extremely valuable tissue. They document their work with images of the tissue and their arguments for their divisions are solid. I feel that they have earned the right to speculate - with qualifications - which they provide.

    1. Author response:

      Reviewer #3 (Public Review):

      (1) Conditions on growth and interaction rates for feasibility and stability. The authors approach this using a mean field approximation, and it is important to note that there is no particular temperature dependence assumed here: as far as it goes, this analysis is completely general for arbitrary Lotka-Volterra interactions.

      However, the starting point for the authors' mean field analysis is the statement that "it is not possible to meaningfully link the structure of species interactions to the exact closed-form analytical solution for [equilibria] 𝑥^*_𝑖 in the Lotka-Volterra model."

      I may be misunderstanding, but I don't agree with this statement. The time-independent equilibrium solution with all species present (i.e. at non-zero abundances) takes the form

      x^* = A^{-1}r

      where A is the inverse of the community matrix, and r is the vector of growth rates. The exceptions to this would be when one or more species has abundance = 0, or A is not invertible. I don't think the authors intended to tackle either of these cases, but maybe I am misunderstanding that.

      So to me, the difficulty here is not in writing a closed-form solution for the equilibrium x^*, it is in writing the inverse matrix as a nice function of the entries of the matrix A itself, which is where the authors want to get to. In this light, it looks to me like the condition for feasibility (i.e. that all x^* are positive, which is necessary for an ecologically-interpretable solution) is maybe an approximation for the inverse of A---perhaps valid when off-diagonal entries are small. A weakness then for me was in understanding the range of validity of this approximation, and whether it still holds when off-diagonal entries of A (i.e. inter-specific interactions) are arbitrarily large. I could not tell from the simulation runs whether this full range of off-diagonal values was tested.
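
      For concreteness, the closed-form equilibrium discussed above can be sketched numerically. This is an illustrative sketch only, not the authors' method: it assumes the convention dx_i/dt = x_i (r_i - sum_j A_ij x_j), under which x^* = A^{-1}r, and the matrices, growth rates and function names are hypothetical. It checks feasibility (all x^* positive) and local stability (all Jacobian eigenvalues with negative real part) separately.

```python
import numpy as np


def glv_equilibrium(A, r):
    """Interior equilibrium of dx_i/dt = x_i (r_i - sum_j A_ij x_j).

    Solves A x = r, which is equivalent to x* = A^{-1} r when A is invertible.
    """
    return np.linalg.solve(A, r)


def is_feasible(x_star):
    """Feasible means every species has strictly positive abundance."""
    return bool(np.all(x_star > 0))


def is_locally_stable(A, x_star):
    """Local stability from the Jacobian at x*, J = -diag(x*) A (assumed convention)."""
    jacobian = -np.diag(x_star) @ A
    return bool(np.all(np.linalg.eigvals(jacobian).real < 0))


# Hypothetical weak-interaction community: small off-diagonal entries.
A_weak = np.array([[1.0, 0.2, 0.1],
                   [0.3, 1.0, 0.2],
                   [0.1, 0.4, 1.0]])
r = np.ones(3)
x_weak = glv_equilibrium(A_weak, r)

# Hypothetical strong-competition pair: one large off-diagonal entry.
A_strong = np.array([[1.0, 1.5],
                     [0.5, 1.0]])
x_strong = glv_equilibrium(A_strong, np.ones(2))

print("weak interactions:  ", x_weak, is_feasible(x_weak), is_locally_stable(A_weak, x_weak))
print("strong interactions:", x_strong, is_feasible(x_strong))
```

      In this toy case the weakly interacting community is feasible and stable, while the strongly competing pair has a negative abundance at equilibrium. Nothing here tests the range of validity of the mean field approximation itself.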

      We thank the reviewer for pointing this out and we agree that the language used is imprecise. The GLV model is solvable using the matrix inversion method but as they note, this does not give an interpretable expression in terms of the system parameters. This is important as we aim to build understanding of how these parameters (which in turn depend on temperature) affect the richness in communities. We have made this clearer in lines 372-379.

      With regard to the validity of the approximation, we have significantly increased the detail of the method in the manuscript, including the assumptions it makes (lines 384-393). In general the method assumes that any individual interaction has a weak effect on abundance. This will fail when the variation in interactions becomes too strong but should be robust to changes in the average interaction strength across the community.

      As a secondary issue here, it would have been helpful to understand whether the authors' feasible solutions are always stable to small perturbations. In general, I would expect this to be an additional criterion needed to understand diversity, though as the authors point out there are certain broad classes of solutions where feasibility implies stability.

      As the reviewer notes, previous work using the GLV model by ? has shown that feasibility almost surely implies stability in the GLV. Thus we expect that our richness estimates derived from feasibility will closely resemble those from stability. We have amended the main text to make this argument clear on lines 321-335.

      (2) I did not follow the precise rationale for selecting the temperature dependence of growth rate and interaction rates, or how the latter could be tested with empirical data, though I do think that in principle this could be a valuable way to understand the role of temperature dependence in the Lotka-Volterra equations.

      First, as the authors note, "the temperature dependence of resource supply will undoubtedly be an important factor in microbial communities"

      Even though resources aren't explicitly modeled here, this suggests to me that at some temperatures, resource supply will be sufficiently low for some species that their growth rates will become negative. For example, if temperature dependence is such that the limiting resource for a given species becomes too low to balance its maintenance costs (and hence mortality rate), it seems that the net growth rate will be negative. The alternative would be that temperature affects resource availability, but never such that a limiting resource leads to a negative growth rate when a taxon is rare.

      On the other hand, the functional form for the distribution of growth rates (eq 3) seems to imply that growth rates are always positive. I could imagine that this is a good description of microbial populations in a setting where the resource supply rate is controlled independently of temperature, but it wasn't clear how generally this would hold.

      We thank the reviewer for their comment. The assumption of positive growth rates is indeed a feature of the Boltzmann-Arrhenius model of temperature dependence. We use the Boltzmann-Arrhenius model due to the dependence of growth on metabolic rate. As metabolic rate is ultimately determined by biochemical kinetics, its temperature dependence is well described by the Boltzmann-Arrhenius model. In addition to this reasoning, there is a wealth of empirical evidence supporting the use of the Boltzmann-Arrhenius model to describe the temperature dependence of growth rate in microbes.

      Ultimately the temperature dependence of resource supply is not something we can directly consider in our model. As such we have to assume that resource supply is sufficient to maintain positive growth rates in the community. Note that this assumption only requires that resource supply is sufficient to maintain positive growth rates (i.e. the maximal growth rate of species in isolation), not that resource supply is sufficient to maintain growth in the presence of intra- and interspecific competition. We have updated the manuscript in lines 156-159 to make these assumptions clearer.
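
      The positivity of growth rates under this model can be illustrated with a short sketch. It assumes a commonly used normalized form, B(T) = B0 * exp(-E/k * (1/T - 1/T_ref)); the function name and the parameter values (B0 = 1, E = 0.65 eV, T_ref = 293.15 K) are hypothetical and are not taken from the manuscript.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def boltzmann_arrhenius(temp_k, b0=1.0, e_act=0.65, t_ref=293.15):
    """Rate at temperature temp_k (kelvin), normalized so the rate is b0 at t_ref.

    b0, e_act (activation energy, eV) and t_ref are illustrative assumptions.
    """
    return b0 * math.exp(-e_act / BOLTZMANN_EV_PER_K * (1.0 / temp_k - 1.0 / t_ref))


# Evaluate from 0 to 40 degrees Celsius in 5-degree steps.
temps_k = [273.15 + c for c in range(0, 45, 5)]
rates = [boltzmann_arrhenius(t) for t in temps_k]

for t, rate in zip(temps_k, rates):
    print(f"{t:7.2f} K  rate = {rate:.4f}")
```

      Because the exponential is positive for any finite temperature, this form can slow growth but never make it negative, which is why positive resource-limited growth has to be stated as a separate assumption.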

      Secondly, while I understand that the growth rate in the exponential phase for a single population can be measured to high precision in the lab as a function of temperature, the assumption for the form of the interaction rates' dependence on temperature seems very hard to test using empirical data. In the section starting L193, the authors seem to fit the model parameters using growth rate dependence on temperature, but then assume that it is reasonable to "use the same thermal response for growth rates and interactions". I did not follow this, and I think a weakness here is in not providing clear evidence that the functional form assumed in Equation (4) actually holds.

      The reviewer is correct: it is very difficult to measure interaction coefficients experimentally, and to our knowledge there is little to no data available on their empirical temperature responses. As a best guess, we use the observed variation in thermal physiology parameters for growth rate as a proxy, assuming that interactions must also depend on the metabolic rates of the interacting species (see also response to comment 8).

    1. eLife assessment

      This important study builds on a previous publication, demonstrating that T. brucei has a continuous endomembrane system, which probably facilitates high rates of endocytosis. Using a range of cutting-edge approaches, the authors present compelling evidence that an actomyosin system, with the myosin TbMyo1 as an active molecular motor, is localized close to and can associate with the endosomal system in the bloodstream form of Trypanosoma brucei. It shows convincingly that both actin and Myo I play a role in the organization and integrity of the endosomal system: both RNAi-mediated depletion of Myo1, and treatment of the cells with latrunculin A resulted in endomembrane disruption. This work should be of interest to cell biologists and microbiologists working on the cytoskeleton, and unicellular eukaryotes.

    1. Author response:

      Reviewer #3 (Public Review):

      The paper by Rai and colleagues examines the transcriptional response of Candida glabrata, a common human fungal pathogen, during interaction with macrophages. They use RNA PolII profiling to identify not just the total transcripts but instead focus on the actively transcribing genes. By examining the profile over time, they identify particular transcripts that are enriched at each timepoint, and build a hierarchical model for how a transcription factor, Xbp1, may regulate this response. Due to technical difficulties in identifying direct targets of Xbp1 during infection, the authors then turn to the targets of Xbp1 during cellular quiescence.

      The authors have generated a large and potentially impactful dataset, examining the responses of C. glabrata during an important host-pathogen interface. However, the conclusions that the authors make are not well supported by the data. The ChIP-seq is interesting, but the authors make conclusions about the biological processes that are differentially regulated without testing them experimentally. Because Candida glabrata has a significant percent of the genome without GO term annotation, the GO term enrichment analysis is less useful than in a model organism. To support these claims, the authors should test the specific phenotypes, and validate that the transcriptional signature is observed at the protein level.

      Additionally, the authors should also include images of the infections, along with measurements of phagocytosis, to show that the time points are appropriate. At 30 minutes, are C. glabrata actually internalized or just associated? This may explain the difference in adherence genes at the early timepoint. For example, in Lines 123-132, the authors could measure the timing of ROS production by macrophages to determine when these attacks are deployed, instead of speculating based on the increased transcription of DNA damage response genes. Potentially, other factors could be influencing the expression of these proteins. At the late stage of infection, the authors should measure whether the C. glabrata cells are proliferating, or if they have escaped the macrophage, as other fungi can during infection. This may explain some of the increase in transcription of genes related to proliferation.

      An additional limitation to the interpretation of the data is that the authors should put their work in the context of the existing literature on C. albicans temporal adaptation to macrophages, including recent work from Munoz (doi: 10.1038/s41467-019-09599-8), Tucey (doi: 10.1016/j.cmet.2018.03.019), and Tierney (doi: 10.3389/fmicb.2012.00085), among others.

      When comparing the transcriptional profile between WT and xbp1 mutant, it is not clear whether the authors compared the strains under non-stress conditions. The authors should include an analysis of the wild-type to xbp1 mutants in the absence of macrophage stress, as the authors' claims of precocious transcription may be a function of overall decreased transcriptional repression, even in the absence of the macrophage stress. The different cut-offs used to call peaks in the two strain backgrounds are also somewhat concerning; it is not clear to me whether that will obscure the transcriptional signature of each of the strains. Additionally, the authors go on to show that the xbp1 mutant has a significant proliferation defect in macrophages, so potentially this could confound the PolII binding sites if the cells are dying.

      In the section on hierarchical analysis of transcription factors, at least one epistasis experiment should have been performed to validate the functional interaction between Xbp1 and a particular transcription factor. If the authors propose a specific motif, they should test this experimentally through EMSA assays to fully test that the motif is functional.

      The jump from macrophages to quiescent culture is also not well justified. If the transcriptional program is so dynamic during a timecourse of macrophage infection, it is hard to translate the findings from a quiescent culture to this host environment.

      Overall, there is a strong beginning and the focus on active transcription in the macrophage is an exciting approach. However, the conclusions need additional experimental evidence.

      We thank the reviewer for the critical analysis of our manuscript and the comments.

      We fully agree that the jump from macrophages to quiescent culture is not well justified. We have successfully performed CgXbp1 ChIP-seq during macrophage infection and have rewritten the manuscript according to the new results. With the CgXbp1 ChIP-seq data during macrophage infection added, we have removed the data related to quiescence to focus the paper on the macrophage response. Because of this, we have also removed the DNA binding motif analysis from this work and will report the findings in a separate manuscript comparing CgXbp1 binding between macrophage response and quiescence.

      As mentioned above, the RNAPII ChIP-seq time course experiment compared RNAP occupancies at different times during infection to the first infection time point. We did not calculate relative to the data in the absence of stress (e.g. before infection), because Xbp1 was expressed at a low level and induced by stresses. Hence its role under no stress conditions is expected to be less than inside macrophages. In addition, up-regulation of its target genes depends on the presence of their transcriptional activators under the experimental conditions, which is going to be very different in normal growth media (RPMI or YPD; i.e. before infection) versus inside macrophages. Hence, comparing to normal growth media would not show the real CgXbp1 effects and/or the CgXbp1 effect might be different. In fact, this can be seen from the new RNAseq analysis of wildtype and Cgxbp1∆ C. glabrata cells in the presence and absence of fluconazole (which has been added to the revised manuscript to study CgXbp1’s role in fluconazole resistance). The result shows that CgXbp1 (which was expressed at a low level) has a very small effect on global expression and the up-regulated genes are mainly related to transmembrane transport. More importantly, the effect of the Cgxbp1∆ mutant on TCA cycle and amino acid biosynthesis genes’ expression during macrophage infection is not observed when the mutant is grown under normal growth conditions (YPD without fluconazole). Therefore, the results show that CgXbp1 has condition-specific effects on global gene expression, which also depend on the transcriptional activators present in the cell. The result of the new RNAseq analysis of wildtype and Cgxbp1∆ C. glabrata cells in the absence of fluconazole is described in lines 329-339 as follows: “On the other hand, 135 genes were differentially expressed in the Cgxbp1∆ mutant during normal exponential growth (i.e. no fluconazole treatment) (Figure 6c) with up-regulated genes highly enriched with the “transmembrane transport” function and down-regulated genes associated with different metabolic processes (e.g. carbohydrate, glycogen and trehalose) (e.g. carbon metabolism, nucleotide metabolism, and transmembrane transport, etc.) (Supplementary Table 12). Interesting, the TCA cycle and amino acid biosynthesis genes, whose expressions were accelerated in the Cgxbp1∆ mutant during macrophage (Figure 3C, 3D), were not affected by the loss of CgXbp1 function under normal growth conditions (i.e. in YPD media without fluconazole) (Supplementary Figure 11, Supplementary Table 11), suggesting that the overall (direct and indirect) effects of CgXbp1 are condition-specific.”

      Regarding the comment about RNAPII binding being affected by dying cells, our observation of reduced proliferation does not mean that the cells were dying, because we did observe an increase in cell numbers over time (i.e. the cells were proliferating), but the rate of proliferation was slower in the Cgxbp1∆ mutant compared to wildtype. Presumably, the reduced proliferation and/or growth within macrophages is due to poorer adaptation in and compromised response to macrophages.

      We have also discussed our findings in the context of the suggested (and other) literature in various parts of the Discussion.

      Reviewer #4 (Public Review):

      Macrophages are the first line of defense against invading pathogens. C. glabrata must interact with these cells as do all pathogens seeking to establish an infection. Here, a ChIP-seq approach is used to measure RNA polymerase II levels across Cg genes in a macrophage infection assay. Differential gene expression is analyzed with increasing time of infection. These differentially expressed genes are compared at the promoter level to identify potential transcription factors that may be involved in their regulation. A factor called CgXbp1 on the basis of its similarity to the S. cerevisiae Xbp1 protein is characterized. ChIP-seq is done on CgXbp1 using in vitro grown cells and a potential binding site identified. Evidence is provided that CgXbp1 affects virulence in a Galleria system and that this factor might impact azole resistance.

      As the authors point out, candidiasis associated with C. glabrata has dramatically increased in the recent past. Understanding the unique aspects of this Candida species would be of great value in trying to unravel the basis of the increasing fungal disease caused by C. glabrata. The use of ChIP-seq analysis to assess the time-dependent association of RNA polymerase II with Cg genes is a nice approach. Identification of CgXbp1 as a potential participant in the control of this gene expression program is also interesting. Unfortunately, this work suffers by comparison to a significant amount of previous effort that renders the progress detailed here incremental at best.

      I agree that their ChIP-seq time course of RNA polymerase II distribution across the Cg genome is both elegant and an improvement on previous microarray experiments. However, these microarray experiments were carried out 14 years ago and while the current work is certainly at higher resolution, little more can be gleaned from the current work. The authors argue that standard transcriptional analysis is compromised by transcript stability effects. I would suggest that, while no approach is without issues, quite a bit has been learned from approaches like RNA-seq and there are recent developments to this technique that allow for a focus on newly synthesized mRNA (thiouridine labeling).

      The CgXbp1 characterization relies heavily on work from S. cerevisiae. This is disappointing as conservation of functional links between C. glabrata and S. cerevisiae is not always predictable.

      The effects caused by loss of CgXBP1 on virulence (Figure 4) may be statistically significant but are modest. No comparison is shown for another gene that has already been accepted to have a role in virulence to allow determination of the biological importance of this effect.

      The phenotypic effects of the loss of XBP1 on azole resistance look rather odd (Figure 6). The appearance of fluconazole resistant colonies in the xbp1 null strain occurs at a very low frequency and seems to resemble the appearance of rho0 cells in the population. The vast majority of xbp1 null cells do not exhibit increased growth compared to wild-type in the presence of fluconazole.

      Irrespective of the precise explanation, more analysis should be performed to confirm that CgXbp1 is negatively regulating the genes suggested in Figure 6A to be responsible for the increased fluconazole resistance.

      Additionally, the entire analysis of CgXbp1 is based on ChIP-seq performed using cells grown under very different conditions than the RNA polymerase II study. Evidence should be provided that the presumptive CgXbp1 target genes actually impact the expression profiles established earlier.

      We thank the reviewer for the critical analysis of our manuscript. We have done the following to address the comments. As a result, the manuscript is significantly improved.

      • The ChIP-seq data of Xbp1 in macrophages has been successfully generated and the result is now presented in Figure 2C-2F, and lines 182-227 of the revised manuscript. With this addition, we have removed the ChIP-seq data related to quiescence from the revised manuscript and rewritten the manuscript to focus on the role of Xbp1 in macrophages.

      • We agree that the conservation of functional links between C. glabrata and S. cerevisiae is not always predictable. That is why we did not rely solely on the S. cerevisiae network for inferring Xbp1’s functions, and took several different approaches (e.g. ChIP-seq of Xbp1 and characterization of the Cgxbp1∆ mutant) to delineate its functions.

      • We also agree that the virulence effect is modest, but it is, nevertheless, an effect that may contribute to the overall virulence of C. glabrata. Since virulence is a pleiotropic trait involving many genes and every gene affects different aspects of the complex process, we feel that it is not fair to penalize a given gene based on its (weaker) effect relative to another gene. Therefore, we respectfully disagree that another gene should be included for benchmarking the effect.

      • We have measured C. glabrata cell numbers in a time course experiment. The result (presented in Figure 4A) showed that there was an increase in cell number at the end of the macrophage infection time course experiment (e.g. 8 hr). We have highlighted this information on lines 278-283.

      • Additional analysis of the fluconazole resistance phenotype of the Cgxbp1∆ mutant has been added, including standard MIC assays. The results are presented in Figure 5C-5E.

      • As suggested and to understand the role of CgXbp1 on fluconazole resistance, we have now carried out RNAseq analysis of WT and the Cgxbp1∆ mutant in the presence and absence of fluconazole. The genes differentially controlled in the Cgxbp1∆ mutant have been identified and a proposed model on how CgXbp1 affects fluconazole resistance is added to Figure 7 in the revised manuscript.

    1. Author response:

      Reviewer #1 (Public Review):

      The authors conducted cross-species comparisons between the human brain and the macaque brain to disentangle the specific characteristics of structural development of the human brain. Although previous studies had revealed similarities and differences in brain anatomy between the two species by spatially aligning the brains, the authors made the comparison along the chronological axis by establishing models for predicting chronological age using brain structural features as input. The rationale is clear given that brain development occurs over time in both. More interestingly, the model trained on macaque data was better able to predict the age of humans than the human-trained model was at predicting macaque age. This revealed a brain cross-species age gap (BCAP) that quantified the discrepancy in brain development between the two species, and the authors even found this BCAP measure was associated with performance on behavioral tests in humans. Overall, this study provides important and novel insights into the unique characteristics of human brain development. The authors have employed a rigorous scientific approach, reflecting diligent efforts to scrutinize the patterns of brain age models across species. The clarity of the rationale, the interpretability of the methods, and the quality of the presentation all contribute to the strength of this work.

      We are grateful for your helpful and thorough review and for being so positive about our manuscript. Following your recommendations, we have added more analytic details that have strengthened our paper. We would like to thank you for your input.

      Reviewer #2 (Public Review):

      In the current study, Li et al. developed a novel approach that aligns chronological age to a cross-species brain age prediction model to investigate the evolutionary effect. This method revealed some interesting findings, like the brain-age gap of the macaque model in predicting human age will increase as chronological age increases, suggesting an evolutionary alignment between the macaque brain and the human brain in the early stage of development. This study exhibits ample novelty and research significance. However, I still have some concerns regarding the reliability of the current findings.

      We thank you for the positive and appreciative feedback on our work and the insightful comments, which we have addressed below.

      Question 1: Although the authors named their new method a "cross-species" model, the current study only focused on the prediction between humans and macaques. It would be better to discuss whether their method can also generalize to cross-species examination of other species (e.g., C. elegans), which may provide more comprehensive evolutionary insights. Also, other future directions with their new method are worth discussing.

      We appreciate your insightful comment regarding the generalizability of our model to other species. As you noted, we performed only a human-macaque cross-species study and did not include other species. We focused on humans and macaques because the macaque is considered one of the closest primates to humans, apart from chimpanzees, and is thus considered to be the best model for studying human brain evolution. However, our proposed method has limitations that restrict its generalizability to other species, e.g., C. elegans. First, our model was trained using MRI data, which limits its applicability to species for which such data are unavailable. This technological requirement is a barrier to broader cross-species application. Second, our current model is based on homologous brain atlases that are available for both humans and macaques. The lack of comparable atlases for other species further restricts the model's generalizability. We have discussed this limitation in the revised manuscript and outlined potential future directions to overcome these challenges. This includes discussing the need for developing comparable imaging techniques and standardized brain atlases across a wider range of species to enhance the model's applicability and broaden our understanding of cross-species neurodevelopmental patterns.

      On page 15, lines 11-18

      “However, the existing limitation should be noted regarding the generalizability of our proposed approach for cross-species brain comparison. Our current model relies on homologous brain atlases, and the lack of comparable atlases for other species restricts its broader applicability. To address this limitation, future research should focus on developing prediction models that do not depend on atlases. For instance, 3D convolutional neural networks could be trained directly on raw MRI data for age prediction. These deep learning models may offer greater flexibility for cross-species applications once the training within species is complete. Such advancements would significantly enhance the model's adaptability and expand its potential for comparative neuroscience studies across a wider range of species.”

      Question 2: Algorithm of prediction model. In the method section, the authors only described how they chose features, but provided no description of the algorithm (e.g., support vector regression) they used. Please add relevant descriptions to the methods.

      Thank you for your comment. We apologize for not providing sufficient details about the model training process in our initial submission. In our study, we used a linear regression model for prediction. We have provided more details regarding the algorithm of prediction model in our response to Reviewer #1. For your convenience, we have attached them below.

      For details on the algorithm of prediction model:

      “A linear regression model was adopted for intra- and inter-species age prediction. The linear regression model was built including the following three main steps: 1) Feature selection: a total of two steps are required to extract the final features. The first step is preliminary extraction. First, all the human or macaque participants were divided into 10-fold and 9-fold was used for model training and 1-fold for model test. The preliminary features were chosen by identifying the significantly age-associated features with p < 0.01 during calculating Pearson’s correlation coefficients between all the 260 features and actual ages of the 9-fold subjects. This process was repeated 100 times. Since we obtained not exactly the same preliminary features each time, we thus further analyzed the preliminary features using two methods to determine the final features: common features and minimum mean absolute error (min MAE). Common features are the preliminary features that were selected in all the 100 times during preliminary model training. The min MAE features were the preliminary features that with the smallest MAE value during the 100 times model test for predicting age. After the above feature selections, we obtained two sets of features: 62 macaque features and 225 human features (common features) and 117 macaque features and 239 human features (min MAE). In addition, to further exclude the influences of unequal number of features in human and macaque, we also selected the first 62 features in human and macaque to test the model prediction performances. 2) Model construction: we conducted age prediction linear model using 10-fold cross-validation based on the selected features for human and macaque separately. The linear model parameters are obtained using the training set data and applied to the test set for prediction. The above process is also repeated 100 times. 
3) Prediction: with the above results, we obtained the optimal linear prediction models for human and macaque. Next, we performed intra-species and inter-species brain age prediction, i.e., human model predicted human age, human model predicted macaque age, macaque model predicted macaque age and macaque model predicted human age. Three sets of features (62 macaque features and 225 human features; 117 macaque features and 239 human features; 62 macaque features and 62 human features) were used to test the prediction models for cross-validation and to exclude effects of different number of features in human and macaque. In the main text, we showed the results of brain age prediction, brain developmental and evolutional analyses based on common features and the results obtained using other two types of features were shown in supplementary materials. The prediction performances were evaluated by calculating the Pearson’s correlation and MAE between actual ages and predicted ages.”
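
      The steps quoted above can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' code: the function names, the synthetic feature matrix, and the use of a Fisher z approximation for the Pearson p-value are all assumptions made here. It performs age-correlation feature selection (p < 0.01) inside each training fold, fits an ordinary least-squares linear model, and reports Pearson's r and MAE between actual and predicted ages. It omits the 100 repetitions and the common-feature and min-MAE selection steps.

```python
import math

import numpy as np


def pearson_r_and_p(X, y):
    """Per-feature Pearson r with y and an approximate two-sided p-value.

    The p-value uses the Fisher z approximation (an assumption of this sketch).
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    r = (Xc * yc[:, None]).sum(axis=0) / np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    r = np.clip(r, -0.999999, 0.999999)
    z = np.arctanh(r) * math.sqrt(n - 3)
    p = np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])
    return r, p


def cross_validated_age_prediction(X, age, n_folds=10, alpha=0.01, seed=0):
    """10-fold CV: select age-associated features on the training folds, then fit OLS."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(age))
    predicted = np.empty(len(age), dtype=float)
    for test_idx in np.array_split(order, n_folds):
        train_idx = np.setdiff1d(order, test_idx)
        _, p = pearson_r_and_p(X[train_idx], age[train_idx])
        keep = np.flatnonzero(p < alpha)
        if keep.size == 0:  # fall back to the single most age-associated feature
            keep = np.array([int(np.argmin(p))])
        design = np.column_stack([np.ones(train_idx.size), X[train_idx][:, keep]])
        coef, *_ = np.linalg.lstsq(design, age[train_idx], rcond=None)
        test_design = np.column_stack([np.ones(test_idx.size), X[test_idx][:, keep]])
        predicted[test_idx] = test_design @ coef
    mae = float(np.mean(np.abs(predicted - age)))
    r = float(np.corrcoef(predicted, age)[0, 1])
    return predicted, mae, r


# Synthetic stand-in for structural features: 5 of 20 features track age.
data_rng = np.random.default_rng(1)
age = data_rng.uniform(5.0, 25.0, size=200)
features = data_rng.normal(size=(200, 20))
features[:, :5] += 0.5 * age[:, None]

predicted, mae, r = cross_validated_age_prediction(features, age)
print(f"Pearson r = {r:.3f}, MAE = {mae:.2f} years")
```

      For cross-species prediction, the same fitted coefficients would be applied to the other species' homologous features; that step is not shown because it depends on the atlas-based feature correspondence described in the manuscript.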

      Question 3: Sex difference. The sex difference results are strange to me. For example, in the second row of Figure Supplement 3A, different models show different correlation patterns, but why are their Pearson's r values all equal to 0.3939? If these are only typos, please correct them. The authors claimed that they found no sex difference. However, the results in Figure Supplement 3 show that the female model seems to have poorer performance in predicting macaque age from the human model. Moreover, accumulated studies have reported sex differences in developing brains (Hines, 2011; Kurth et al., 2021). I think it is also worth discussing why sex differences can't be found in the evolutionary effect.

      Reference:

      Hines, M. (2011). Gender development and the human brain. Annual review of neuroscience, 34, 69-88.

      Kurth, F., Gaser, C., & Luders, E. (2021). Development of sex differences in the human brain. Cognitive Neuroscience, 12(3-4), 155-162.

      It is recommended that the authors explore different prediction models for different species. Maybe macaques are suitable for linear prediction models, and humans are suitable for nonlinear prediction models.

      Thank you for pointing out the typos and for your comments on sex differences. In Figure Supplement 3A, the Pearson’s r values contained typos, and we have corrected them in the updated Figure 2-figure supplement 3. For details, please see the updated Figure 2-figure supplement 3 and the following figure.

      Regarding gender effects, we acknowledge your point about the importance of gender differences in understanding brain evolution and development. In our study, however, the primary goal was to develop a robust age prediction model by maximizing the number of training samples. To mitigate gender-related effects in our main results, we incorporated gender as a covariate in the ComBat harmonization process. We conducted the supplementary analysis, in which the data were split by gender, only to demonstrate the stability of the proposed cross-species age prediction model, not to investigate gender differences. Although gender-specific models could still significantly predict chronological age, we refrained from emphasizing their performance in gender-specific species comparisons because the predicted gender differences are difficult to interpret. For cross-species prediction, it is not convincing to conclude that a higher Pearson’s r value between actual and predicted age reflects more conserved evolution in males or females. In addition, we adopted the same prediction model for human and macaque, rather than different ones, in order to establish a model that is comparable between species. Generally speaking, a nonlinear model can achieve better prediction accuracy than a linear model, so using different models for different species would make cross-species prediction unfair. Importantly, our study aimed to develop a new index based on the same prediction models to quantify differences in brain evolution, i.e., the brain cross-species age gap (BCAP), instead of relying on traditional statistical analyses. Different prediction models for different species may introduce bias caused by the prediction methods and thus affect the accuracy of BCAP. We therefore used the linear model with the best intra-species prediction performance for cross-species prediction in this study.

      Our main goal in this study was to set up a stable cross-species prediction model, and the models built using either male or female subjects performed well in cross-species prediction. However, as you note, how to characterize evolutionary gender differences without bias using machine learning approaches needs further investigation, since there are many reports of gender differences in the developing human brain. In fact, whether macaque brains show the same gender differences as human brains is an interesting scientific question worth studying. We have therefore included a discussion of how machine learning methods could be used to study evolutionary gender differences in our revised manuscript.

      On page 15, lines 18-23 and page 16, line 1-4

      “Many studies have reported sex differences in developing human brains (Hines, 2011; Kurth, Gaser, & Luders, 2021), however, whether macaque brains have similar sex differences as humans is still unknown. We used a machine learning method for cross-species prediction to quantify brain evolution and the established prediction models are stable even when only using male or female data, which may indicate that the proposed cross-species prediction model has no evolutionary sex difference. Although the stable prediction model can be established in either male or female participants for cross-species prediction, this indeed does not mean that there are no evolutionary sex differences due to lack of quantitative comparative analysis. In the future, we need to develop more objective, quantifiable and stable index for studying sex differences using machine learning methods to further identify sex differences in the evolved brain”

      Reviewer #3 (Public Review):

      The authors identified a series of WM and GM features that correlated with age in human and macaque structural imaging data. The data was gathered from the HCP and WA studies, which was parcellated in order to yield a set of features. Features that correlated with age were used to train predictive intra and inter-species models of human and macaque age. Interestingly, while each model accurately predicted the corresponding species age, using the macaque model to predict human age was more accurate than the inverse (using the human model to predict macaque age). In addition, the prediction error of the macaque model in predicting human age increased with age, whereas the prediction error of the human model predicting macaque age decreased with age.

      After elaboration of the predictive models, the authors classified the features for prediction into human-specific, macaque-specific and common to human and macaque, where they most notably found that macaque-only and common human-macaque areas were located mainly in gray matter, with only a few human-specific features found in gray matter. Furthermore, the authors found significant correlations between BCAP and picture vocabulary (positive correlation) test and visual sensitivity (negative correlation) test. Several white matter tracts (AF, OR, SLFII) were also identified showing a correlation with BCAP.

      Thank you for providing this excellent summary. We appreciate your thorough review and concise overview of our work.

      STRENGTHS AND WEAKNESSES

      The paper brings an interesting perspective on the evolutionary trajectories of human and non-human primate brain structure, and its relation to behavior and cognition. Overall, the methods are robust and support the theoretical background of the paper. However, the overall clarity of the paper could be improved. There are many convoluted sentences and there seems to be both repetition across the different sections and unclear or missing information. For example, the Introduction does not clearly state the research questions, rather just briefly mentions research gaps existing in the literature and follows by describing the experimental method. It would be desirable to clearly state the theoretical background and research questions and leave out details on methodology. In addition, the results section repeats a lot of what is already stated in the methods. This could be further simplified and make the paper much easier to read.

      In the discussion, authors mention that "findings about cortex expansion are inconsistent and even contradictory", a more convincing argument could be made by elaborating on why the cortex expansion index is inadequate and how BCAP is more accurate.

      Thank you for highlighting the interesting aspects of our work. We are sorry for the lack of clarity in certain parts of our manuscript. Following your valuable suggestions, we have revised the manuscript to reduce unnecessary repetition and to state our research question more clearly in the Introduction. Specifically, unlike previous analyses of human and macaque evolution using comparative neuroscience, this study embeds a chronological axis into the cross-species evolutionary analysis. It constructs a linear prediction model of brain age for humans and macaques and quantitatively describes the degree of evolution. The brain-structure-based cross-species age prediction model and the cross-species brain age differences proposed in this study further eliminate the inherent developmental effects of humans and macaques on cross-species evolutionary comparisons, providing new perspectives and approaches for studying cross-species development. We have also simplified the Results section to remove the existing repetition. Regarding the comparison between the cortex expansion index and BCAP, we would like to emphasize that the cortex expansion index was derived without fully considering cross-species alignment along the chronological axis. Specifically, this index does not correspond to a specific developmental stage, but rather focuses on a direct comparison between the two species. In contrast, BCAP addresses this limitation by using a prediction model to establish alignment (or misalignment) between species at the individual level. Therefore, BCAP may serve as a more flexible and nuanced tool for cross-species brain comparison.

      STUDY AIMS AND STRENGTH OF CONCLUSIONS

      Overall, the methods are robust and support the theoretical background of the paper, but it would be good to state the specific research questions -even if exploratory in nature- more specifically. Nevertheless, the results provide support for the research aims.

      Thank you for the excellent suggestion. We have revised our Introduction to state the specific research question, as mentioned above.

      IMPACT OF THE WORK AND UTILITY OF METHODS AND DATA TO THE COMMUNITY

      This study is a good first step in providing a new insight into the neurodevelopmental trajectories of humans and non-human primates besides the existing cortical expansion theories.

      Thank you for your encouraging comment.

      ADDITIONAL CONTEXT:

      It should be clearly stated both in the abstract and methods that the data used for the experiment came from public databases.

      Thank you for your suggestion. We have added this information to both the Abstract and the Methods. For details, please see page 2, line 9 in the Abstract section; page 16, lines 10-11 and page 17, lines 6-10 in the Materials and Method section.

    1. Author response:

      Reviewer #1 (Public Review):

      Using structural analysis, Bonchuk and colleagues demonstrate that the TTK-like BTB/POZs of insects form stable hexameric assemblies composed of trimers of POZ dimers, a configuration observed consistently across both homomultimers and heteromultimers, which are known to be formed by TTK-like BTB/POZ domains. The structural data is comprehensive, unambiguous, and further supported by theoretical fold prediction analyses. In particular the judicious complementation of experiments and fold prediction is commendable. This study now adds an important cog that might help generalize the general principles of the evolution of multimerization in members of this fold family.

      I strongly feel that enhancing the inclusivity of the discussion would strengthen the paper. Below, I suggest some additional points for consideration for the same.

      Major points.

      1) It would be valuable to discuss alternative multimer assembly interfaces, considering the diverse ways POZs can multimerize. For instance, the Potassium channel POZ domains form tetramers. A comparison of their inter-subunit interface with that of TTK and non-TTK POZs could provide insightful contrasts.

      Thanks for the suggestion. We added this important comparison, as well as a comparison with recently published structures of filament-forming BTB domains.

      2) The so-called TTK motif, despite its unique sequence signature, essentially corresponds to the N-terminal extension observed in other "non-TTK" proteins such as Miz-1. Given Miz-1's structure, it becomes evident that the utilization of the N-terminal extension for dimerization is shared with the TTK family, suggesting a common evolutionary origin in metazoan transcription factors. Early phylogenetic trees (e.g. in PMID: 9917379) support the grouping of the TTK-like POZs with other animal Transcription factors containing POZ domains such as those with Kelch repeats further suggesting that the extension might be ancestral. Structural investigations by modeling prominent examples or comparing known structures of similar POZ domains, could support this inference. Control comparisons with POZ domains from fungi, plants and amoebozoans like Dictyostelium could offer additional insights.

      We performed AlphaFold2-Multimer modeling of dimers of all BTB domains from the most ancestral metazoan clades, Placozoa and Porifera, along with BTBs from Choanoflagellates, the unicellular eukaryotes closest to the first metazoans. The presence of an N-terminal beta-sheet was evaluated. KLHL-BTBs are present in all eukaryotes and are likely predecessors of ZBTB domains. According to AlphaFold modeling of dimers, all KLHL-BTB domains of plants and basal metazoans have the alpha1 helix, but most of these domains do not possess the additional N-terminal beta-strand (beta1) characteristic of ZBTB domains. We found only one KLHL-BTB (Uniprot ID: AA9VCT1_MONBE) with such an N-terminal extension in the Choanoflagellate proteome, one in the Dictyostelium proteome (Q54F31_DICDI), and 7 (out of 43 BTB domains in total) and 13 (out of 81) such domains in the Trichoplax and Amphimedon proteomes, respectively. There was no significant similarity of the beta1 element at the level of primary sequence. However, most of these domains bear a 3-box/BACK extension and represent typical KLHL-BTBs, which are members of E3 ubiquitin-ligase complexes; they are often associated with a protein-protein interacting MATH domain or WD40 repeats. We found only one protein in the Trichoplax proteome with a beta1 strand but no 3-box/BACK (B3RQ74_TRIAD), thus resembling the ZBTB topology. Thus, BTB domains of this subtype likely emerged early in metazoan evolution. At this point ZBTBs were not yet associated with zinc-fingers. According to our survey, the actual fusion of the ZBTB domain with zinc-finger domains occurred in the evolution of early bilaterian organisms, since proteins with this domain architecture are not found in Radiata but are present in basal Protostomia and Deuterostomia clades. The TTK-type sequence is characteristic only of Arthropoda and emerged early in their evolution. We added all these data to the article.

      3) Exploring the ancestral presence of the aforementioned extension in metazoan transcription factors could serve as a foundation for understanding the evolutionary pathway of hexamerization. This analysis could shed light on exposed structural regions that had the potential to interact post-dimerization with the N-terminal extension and also might provide insights into the evolution of multimer interfaces, as observed in the Potassium channel.

      We added this important comparison as well as comparison with recent structures of filament-forming BTB domains.

      4) Considering the role of conserved residues in the multimer interface is crucial. Reference to conserved residues involved in multimer formation, such as discussed in PMID: 9917379, would enrich the discussion.

      We updated our description of multimer interface with respect to conservation of residues.

      Reviewer #2 (Public Review):

      BTB domains are protein-protein interaction domains found in diverse eukaryotic proteins, including transcription factors. It was previously known that many of the Drosophila transcription factor BTB domains are of the TTK-type - these are defined as having a highly-conserved motif, FxLRWN, at their N-terminus, and they thereby differ from the mammalian BTB domains. Whereas the well-characterised mammalian BTB domains are dimeric, several Drosophila TTK-BTB domains notably form multimers and function as chromosome architectural proteins. The aims of this work were (i) to determine the structural basis of multimerisation of the Drosophila TTK-BTB domains, (ii) to determine how different Drosophila TTK-BTB domains interact with each other, and (iii) to investigate the evolution of this subtype of BTB domain.

      The work significantly advances our understanding of the biology of BTB domains. The conclusions of the paper are mostly well-supported, although some aspects need clarification:

      Hexameric organisation of the TTK-type BTB domains:

      Using cryo-EM, the authors showed that the CG6765 TTK-type BTB domain forms a hexameric assembly in which three "classic" BTB dimers interact via a beta-sheet interface involving the B3 strand. This is particularly interesting, as this region of the BTB domain has recently been implicated in protein-protein interactions in a mammalian BTB-transcription factor, MIZ1. SEC-MALS analysis indicated that the LOLA TTK-type BTB domain is also hexameric, and SAXS data was consistent with a hexameric assembly of the CG6765- and LOLA BTB domains.

      The data regarding the hexameric organisation is convincing. However, interpreting the role of specific regions of the BTB domain is difficult because the description of the molecular contacts lacks depth.

      Heteromeric interactions between TTK-type BTB domains:

      The authors use yeast two-hybrid assays to study heteromeric interactions between various Drosophila TTK-type BTB domains. Such assays are notorious for producing false positives, and this needs to be mentioned. Although the authors suggest that the heteromeric interactions are mediated via the newly-identified B3 interaction interface, there is no evidence to support this, since mutation of B3 yielded insoluble proteins.

      We are aware that Y2H can give false-positive results in cases where the BTB domain fused to the DNA-binding domain can activate reporter genes. Therefore, all tested BTB domains were examined for their ability to activate transcription. Furthermore, in our study, assays with non-TTK-type BTB domains, which showed almost no interactions, provide an additional negative control. We have added a corresponding disclaimer to the text. We agree that our data do not explain the basis for heteromeric interactions. Designing mutations in the B3 beta-sheet proved to be complicated, and using biochemical methods to study the principles of heteromer assembly also does not seem feasible, since most TTK-type BTBs tend to form aggregates and are difficult to express and purify. The most important issue, however, is that a demonstrated ability to assemble heteromers through B3 in a few tested pairs could not be generalized to all pairs; some of them may still use a different mechanism. We used AlphaFold to predict possible mechanisms of heteromer assembly. AlphaFold suggested that both the B3 and the conventional dimerization interfaces can be used for heteromeric interactions, with one preferred over the other in different pairs. Thus, the presence of two potential heteromerization interfaces most likely extends the heteromerization capability of these domains. We changed the text accordingly.

      Evolution of the TTK-type BTB domains:

      The authors carried out a bioinformatics analysis of BTB proteins and showed that most of the Drosophila BTB transcription factors (24 out of 28) are of the TTK-type. They investigated how the TTK-type BTB domains emerged during evolution, and showed that these are only found in Arthropoda, and underwent lineage-specific expansion in the modern phylogenetic groups of insects. These findings are well-supported by the evidence.

    2. Reviewer #2 (Public Review):

      BTB domains are protein-protein interaction domains found in diverse eukaryotic proteins, including transcription factors. It was previously known that many of the Drosophila transcription factor BTB domains are of the TTK-type - these are defined as having a highly-conserved motif, FxLRWN, at their N-terminus, and they thereby differ from the mammalian BTB domains. Whereas the well-characterised mammalian BTB domains are dimeric, several Drosophila TTK-BTB domains notably form multimers and function as chromosome architectural proteins. The aims of this work were (i) to determine the structural basis of multimerisation of the Drosophila TTK-BTB domains, (ii) to determine how different Drosophila TTK-BTB domains interact with each other, and (iii) to investigate the evolution of this subtype of BTB domain.

      The work significantly advances our understanding of the biology of BTB domains. The conclusions of the paper are mostly well-supported, although some aspects need clarification:

      Hexameric organisation of the TTK-type BTB domains:

      Using cryo-EM, the authors showed that the CG6765 TTK-type BTB domain forms a hexameric assembly in which three "classic" BTB dimers interact via a beta-sheet interface involving the B3 strand. This is particularly interesting, as this region of the BTB domain has recently been implicated in protein-protein interactions in a mammalian BTB-transcription factor, MIZ1. SEC-MALS analysis indicated that the LOLA TTK-type BTB domain is also hexameric, and SAXS data was consistent with a hexameric assembly of the CG6765- and LOLA BTB domains.

      The data regarding the hexameric organisation is convincing. However, interpreting the role of specific regions of the BTB domain is difficult because the description of the molecular contacts lacks depth.

      Heteromeric interactions between TTK-type BTB domains:

      The authors use yeast two-hybrid assays to study heteromeric interactions between various Drosophila TTK-type BTB domains. Such assays are notorious for producing false positives, and this needs to be mentioned. Although the authors suggest that the heteromeric interactions are mediated via the newly-identified B3 interaction interface, there is no evidence to support this, since mutation of B3 yielded insoluble proteins.

      Evolution of the TTK-type BTB domains:

      The authors carried out a bioinformatics analysis of BTB proteins and showed that most of the Drosophila BTB transcription factors (24 out of 28) are of the TTK-type. They investigated how the TTK-type BTB domains emerged during evolution, and showed that these are only found in Arthropoda, and underwent lineage-specific expansion in the modern phylogenetic groups of insects. These findings are well-supported by the evidence.

    1. Author response:

      Reviewer #1 - Public Review

      This report describes work aiming to delineate multi-modal MRI correlates of psychopathology from a large cohort of children of 9-11 years from the ABCD cohort. While uni-modal characterisations have been made, the authors rightly argue that multi-modal approaches in imaging are vital to comprehensively and robustly capture modes of large-scale brain variation that may be associated with pathology. The primary analysis integrates structural and resting-state functional data, while post-hoc analyses on subsamples incorporate task and diffusion data. Five latent components (LCs) are identified, with the first three, corresponding to p-factor, internal/externalising, and neurodevelopmental Michelini Factors, described in detail. In addition, associations of these components with primary and secondary RSFC functional gradients were identified, and LCs were validated in a replication sample via assessment of correlations of loadings.

      1.1) This work is clearly novel and a comprehensive study of associations within this dataset. Multi-modal analyses are challenging to perform, but this work is methodologically rigorous, with careful implementation of discovery and replication assessments, and primary and exploratory analyses. The ABCD dataset is large, and behavioural and MRI protocols seem appropriate and extensive enough for this study. The study lays out comprehensive associations between MRI brain measures and behaviour that appear to recapitulate the established hierarchical structure of psychopathology.

      We thank Reviewer 1 for appreciating our methods and findings, and we address their suggestions below:

      1.2) The work does have weaknesses, some of them acknowledged. There is limited focus on the strength of observed associations. While the latent component loadings seem reliably reproducible in the behavourial domain, this is considerably less the case in the imaging modalities. A considerable proportion of statistical results focuses on spatial associations in loadings between modalities - it seems likely that these reflect intrinsic correlations between modalities, rather than associations specific to any latent component.

      We appreciate the Reviewer’s comment and have minimized the reporting of correlations between the loadings from the different modalities in the revised Results (specifically the subsections on LC1, LC2, and LC3). We now refer to Table S4 in each subsection for this information: “Spatial correlations between modality-specific loadings are reported in Supplementary file 1c.”

      For completeness, we report the intrinsic correlations between the different modalities in Supplementary file 1c (P.19):

      “Lastly, although the current work aimed to reduce intrinsic correlations between variables within a given modality through running a PCA before the PLS approach, intrinsic correlations between measures and modalities may potentially be a remaining factor influencing the PLS solution. We, thus, provided an additional overview of the intrinsic correlations between the different neuroimaging data modalities in the supporting results (Supplementary file 1c).”

      1.3) Assessment of associations with functional gradients is similarly a little hard to interpret. Thus, it is hard to judge the implications for our understanding of the neurophysiological basis of psychopathology and the ability of MRI to provide clinical tools for, say, stratification.

      We now provide additional context, including a rising body of theoretical and empirical work, that outlines the value of functional gradients and cortical hierarchies in the understanding of brain development and psychopathology. Please see P.26.

      “Initially demonstrated at the level of intrinsic functional connectivity (Margulies et al., 2016), follow up work confirmed a similar cortical patterning using microarchitectural in-vivo MRI indices related to cortical myelination (Burt et al., 2018; Huntenburg et al., 2017; Paquola et al., 2019), post-mortem cytoarchitecture (Goulas et al., 2018; Paquola et al., 2020, 2019), or post-mortem microarray gene expression (Burt et al., 2018). Spatiotemporal patterns in the formation and maturation of large-scale networks have been found to follow a similar sensory-to-association axis; moreover, there is the emerging view that this framework may offer key insights into brain plasticity and susceptibility to psychopathology (Sydnor et al., 2021). In particular, the increased vulnerability of transmodal association cortices in late childhood and early adolescence has been suggested to relate to prolonged maturation and potential for plastic reconfigurations of these systems (Paquola et al., 2019; Park et al., 2022b). Between mid-childhood and early adolescence, heteromodal association systems such as the default network become progressively more integrated among distant regions, while being more differentiated from spatially adjacent systems, paralleling the development of cognitive control, as well as increasingly abstract and logical thinking. [...] This suggests that neurodevelopmental difficulties might be related to alterations in various processes underpinned by sensory and association regions, as well as the macroscale balance and hierarchy of these systems, in line with previous findings in several neurodevelopmental conditions, including autism, schizophrenia, as well as epilepsy, showing a decreased differentiation between the two anchors of this gradient (Hong et al., 2019). In future work, it will be important to evaluate these tools for diagnostics and population stratification. 
In particular, the compact and low dimensional perspective of gradients may prove beneficial in terms of biomarker reliability as well as phenotypic prediction, as previously demonstrated using typically developing cohorts (Hong et al. 2020). On the other hand, it will be of interest to explore in how far alterations in connectivity along sensory-to-transmodal hierarchies provide sufficient graduality to differentiate between specific psychopathologies, or whether they, as the current work suggests, mainly reflect risk for general psychopathology and atypical development.”

      1.4) The observation of a recapitulation of psychopathology hierarchy may be somewhat undermined by the relatively modest strength of the components in the imaging domain.

      We thank the Reviewer for this comment and have now expressed this limitation in the revised Discussion, P.23.

      “The p factor, internalizing, externalizing, and neurodevelopmental dimensions were each associated with distinct morphological and intrinsic functional connectivity signatures, although these relationships varied in strength.”

      1.5) The task fMRI was assessed with a fairly basic functional connectivity approach, not using task timings to more specifically extract network responses.

      In the revised Discussion on P.24, we acknowledge that more in-depth analyses of task-based fMRI may have offered additional insights into state-dependent changes in functional architecture.

      “While the current work derived main imaging signatures from resting-state fMRI as well as grey matter morphometry, we could nevertheless demonstrate associations to white matter architecture (derived from diffusion MRI tractography) and recover similar dimensions when using task-based fMRI connectivity. Despite subtle variations in the strength of observed associations, the latter finding provided additional support that the different behavioral dimensions of psychopathology more generally relate to alterations in functional connectivity. Given that task-based fMRI data offers numerous avenues for analytical exploration, our findings may motivate follow-up work assessing associations to network- and gradient-based response strength and timing with respect to external stimuli across different functional states.”

      1.6) Overall, the authors achieve their aim to provide a detailed multimodal characterisation of MRI correlations of psychopathology. Code and data are available and well organised and should provide a valuable resource for researchers wanting to understand MRI-based neural correlates of psycho-pathology-related behavioural traits in this important age group. It is largely a descriptive study, with comparisons to previous uni-modal work, but without particularly strong testing of neuroscience hypotheses.

      We thank the Reviewer for recognizing the detail and rigor of our data-driven study and the extensive code and data documentation.

      Reviewer #2 - Public Review

      In "Multi-modal Neural Correlates of Childhood Psychopathology" Kebets et al. integrate multi-modal neuroimaging data using machine learning to delineate dissociable links to diverse dimensions of psychopathology in the ABCD sample. This paper had numerous strengths including a superb use of a large resource dataset, appropriate analyses, beautiful visualizations, clear writing, and highly interpretable results from a data-driven analysis. Overall, I think it would certainly be of interest to a general readership. That being said, I do have several comments for the authors to consider.

      We thank Dr Satterthwaite for the positive evaluation and helpful comments.

      2.1) Out-of-sample testing: while the permutation testing procedure for the PLS is entirely appropriate, without out-of-sample testing the reported effect sizes are likely inflated.

      As discussed in the editorial summary of essential revisions, we agree that out-of-sample prediction indeed provides stronger estimates of generalizability. We assess this by applying the PCA coefficients derived from the discovery cohort imaging data to the replication cohort imaging data. The resulting PCA scores and behavioral data were then z-scored using the mean and standard deviation of the replication cohort. The SVD weights derived from the discovery cohort were applied to the normalized replication cohort data to derive imaging and behavioral composite scores, which were used to recover the contribution of each imaging and behavioral variable to the LCs (i.e., loadings). Out-of-sample replicability of imaging (mean r=0.681, S.D.=0.131) and behavioral (mean r=0.948, S.D.=0.022) loadings was generally high across LCs 1-5. This analysis is reported in the revised manuscript (P.18).

      “Generalizability of reported findings was also assessed by directly applying PCA coefficients and latent components weights from the PLS analysis performed in the discovery cohort to the replication sample data. Out-of-sample prediction was overall high across LCs1-5 for both imaging (mean r=0.681, S.D.=0.131) and behavioral (mean r=0.948, S.D.=0.022) loadings.”
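
      The out-of-sample procedure described above can be sketched on simulated data as follows. This is an illustrative reading of the steps, not the authors' pipeline: all function names, the toy cohort sizes, and the number of PCA components are hypothetical, and loadings are assumed here to be Pearson correlations between the original variables and the composite scores.

```python
import numpy as np

rng = np.random.default_rng(0)
N_IMG, N_BEH = 30, 8
# Fixed mixing vectors so both toy cohorts share one latent brain-behavior factor.
IMG_MIX = rng.normal(size=(1, N_IMG))
BEH_MIX = rng.normal(size=(1, N_BEH))


def simulate(n):
    """Toy cohort: one latent factor drives imaging and behavior (not real data)."""
    latent = rng.normal(size=(n, 1))
    img = latent @ IMG_MIX + rng.normal(size=(n, N_IMG))
    beh = latent @ BEH_MIX + rng.normal(size=(n, N_BEH))
    return img, beh


def zscore(a):
    """Column-wise z-score using the sample's own mean and SD."""
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)


def pca_coefficients(x, n_components):
    """PCA coefficients (variables x components) from mean-centered data."""
    _, _, vt = np.linalg.svd(x - x.mean(axis=0), full_matrices=False)
    return vt[:n_components].T


def pls_weights(x, y):
    """SVD of the behavior-by-imaging cross-covariance matrix."""
    u, _, vt = np.linalg.svd(y.T @ x / (x.shape[0] - 1), full_matrices=False)
    return u, vt.T  # behavioral weights, imaging weights


def loadings(data, scores):
    """Pearson correlation of every variable with every composite score."""
    return zscore(data).T @ zscore(scores) / (data.shape[0] - 1)


def column_correlations(a, b):
    """Correlate matching columns (latent components) of two loading matrices."""
    return np.array([np.corrcoef(a[:, k], b[:, k])[0, 1] for k in range(a.shape[1])])


img_d, beh_d = simulate(400)  # discovery cohort
img_r, beh_r = simulate(300)  # replication cohort

# Discovery: fit the PCA and the PLS weights.
coeff = pca_coefficients(img_d, n_components=10)
x_d, y_d = zscore(img_d @ coeff), zscore(beh_d)
u, v = pls_weights(x_d, y_d)

# Replication: reuse discovery PCA coefficients and PLS weights,
# but z-score with the replication cohort's own mean and SD.
x_r, y_r = zscore(img_r @ coeff), zscore(beh_r)

img_load_d, img_load_r = loadings(img_d, x_d @ v), loadings(img_r, x_r @ v)
beh_load_d, beh_load_r = loadings(beh_d, y_d @ u), loadings(beh_r, y_r @ u)

img_rep = column_correlations(img_load_d, img_load_r)
beh_rep = column_correlations(beh_load_d, beh_load_r)
print("imaging loading replicability per LC:", np.round(img_rep, 3))
print("behavioral loading replicability per LC:", np.round(beh_rep, 3))
```

      Because nothing is refit on the replication cohort, the sign and order of the latent components are inherited from the discovery cohort, so the loadings of the two cohorts can be correlated component by component.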

      2.2) Site/family structure: it was unclear how site/family structure were handled as covariates.

      Only unrelated participants were included in discovery and replication samples (see P.6). The site variable was regressed out of the imaging and behavioral data prior to the PLS analysis using the residuals from a multiple linear model which also included age, age², sex, and ethnicity. This is now clarified on P.29:

      “Prior to the PLS analysis, effects of age, age², sex, site, and ethnicity were regressed out from the behavioral and imaging data using a multiple linear regression to ensure that the LCs would not be driven by possible confounders (Kebets et al., 2021, 2019; Xia et al., 2018). The imaging and behavioral residuals of this procedure were input to the PLS analysis.”
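
      As a rough illustration of this confound-regression step, the sketch below residualizes synthetic data with ordinary least squares. It is not the authors' code: the function name `regress_out`, the three hypothetical sites, and the dummy coding are assumptions, and ethnicity is left out of the toy design for brevity (it would be coded like site).

```python
import numpy as np

def regress_out(data, confounds):
    """Return residuals of data after removing linear effects of the confounds.

    data      : (n_subjects, n_variables)
    confounds : (n_subjects, n_confounds)
    """
    design = np.column_stack([np.ones(len(confounds)), confounds])  # add intercept
    beta, *_ = np.linalg.lstsq(design, data, rcond=None)
    return data - design @ beta

rng = np.random.default_rng(1)
n = 300
age = rng.uniform(9, 11, n)
sex = rng.integers(0, 2, n)
site = rng.integers(0, 3, n)                  # three hypothetical acquisition sites
site_dummies = np.eye(3)[site][:, 1:]         # one-hot coding, first site as reference
confounds = np.column_stack([age, age**2, sex, site_dummies])

# Toy imaging variables that depend on age and site.
imaging = rng.normal(size=(n, 4)) + 0.5 * age[:, None] + site[:, None]
residuals = regress_out(imaging, confounds)   # these would be the input to the PLS
```

      After this step the residuals are linearly uncorrelated with every confound in the design, which is the property the quoted passage relies on.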

      2.3) Anatomical features: I was a bit surprised to see volume, surface area, and thickness all evaluated - and that there were several comments on the correspondence between the SA and volume in the results section. Given that cortical volume is simply a product of SA and CT (and mainly driven by SA), this result may be pre-required.

      As suggested, we reduced the reporting of correlations between the loadings from the different modalities in the revised Results (specifically subsections on LC1, LC2, and LC3). Instead, we now refer to Table S4 in each subsection for this information: “Spatial correlations between modality-specific loadings are reported in Supplementary file 1c.”

      We also reran the PLS analysis while only including thickness and surface area as our structural metrics, to account for potential redundancy of these measures with volume. This analysis and associated findings are reported on P.36 and P.19:

      “As cortical volume is a result of both thickness and surface area, we repeated our main PLS analysis while excluding cortical volume from our imaging metrics and report the consistency of these findings with our main model.”

      “Third, to account for redundancy within structural imaging metrics included in our main PLS model (i.e., cortical volume is a result of both thickness and surface area), we also repeated our main analysis while excluding cortical volume from our imaging metrics. Findings were very similar to those in our main analysis, with an average absolute correlation of 0.898±0.114 across imaging composite scores of LCs 1-5.”
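
      The consistency metric quoted here (an average absolute correlation across composite scores of the latent components) can be sketched as below. The function name `mean_abs_correlation` and the synthetic scores are assumptions; the absolute value is taken because the sign of a latent component is arbitrary across model fits.

```python
import numpy as np

def mean_abs_correlation(scores_a, scores_b):
    """Mean and S.D. of |r| between matching columns (LCs) of two score matrices."""
    r = np.array([np.corrcoef(scores_a[:, c], scores_b[:, c])[0, 1]
                  for c in range(scores_a.shape[1])])
    return np.abs(r).mean(), np.abs(r).std(ddof=1), r

rng = np.random.default_rng(2)
main = rng.normal(size=(500, 5))              # composite scores from a main model
# Hypothetical alternative model: same components plus noise, with LC2 sign-flipped.
alt = main + 0.3 * rng.normal(size=main.shape)
alt[:, 1] *= -1
mean_r, sd_r, r = mean_abs_correlation(main, alt)
```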

      2.4) Ethnicity: the rationale for regressing ethnicity from the data was unclear and may conflict with current best practices.

      We thank the Reviewer for this comment. In light of recent discussions on including this covariate in large datasets such as ABCD (e.g., Saragosa-Harris et al., 2022), we elaborate on our rationale for including this variable in our model in the revised manuscript on P.30:

      “Of note, the inclusion of ethnicity as a covariate in imaging studies has been recently called into question. In the present study, we included this variable in our main model as a proxy for social inequalities relating to race and ethnicity alongside biological factors (age, sex) with documented effects on brain organization and neurodevelopmental symptomatology queried in the CBCL.”

      We also assess the replicability of our analyses when removing race and ethnicity covariates prior to computing the PLS analysis and correlating imaging and behavioral composite scores across both models. We report resulting correlations in the revised manuscript (P.37, 19, and 27):

      “We also assessed the replicability of our findings when removing race and ethnicity covariates prior to computing the PLS analysis and correlating imaging and behavioral composite scores across both models.”

      “Moreover, repeating the PLS analysis while excluding this variable as a model covariate yielded overall similar imaging and behavioral composite scores across LCs to our original analysis. Across LCs 1-5, the average absolute correlations reached r=0.636±0.248 for imaging composite scores, and r=0.715±0.269 for behavioral composite scores. Removing these covariates seemed to exert stronger effects on LC3 and LC4 for both imaging and behavior, as lower correlations across models were specifically observed for these components.”

      “Although we could consider some socio-demographic variables and proxies of social inequalities relating to race and ethnicity as covariates in our main model, the relationship of these social factors to structural and functional brain phenotypes remains to be established with more targeted analyses.”

      2.5) Data quality: the authors did an admirable job in controlling for data quality in the analyses of functional connectivity data. However, it is unclear if a comparable measure of data quality was used for the T1/dMRI analyses. This likely will result in inflated effect sizes in some cases; it has the potential to reduce sensitivity to real effects.

      We agree that data quality was not accounted for in our analysis of T1w- and diffusion-derived metrics. We have now accounted for T1w image quality by adding manual quality control ratings to the regressors applied to all structural imaging metrics prior to performing the PLS analysis, and we report the consistency of this new model with the original findings. See P.36, P.19:

      “We also considered manual quality control ratings as a measure of T1w scan quality. This metric was included as a covariate in a multiple linear regression model accounting for potential confounds in the structural imaging data, in addition to age, age², sex, site, ethnicity, ICV, and total surface area. Downstream PLS results were then benchmarked against those obtained from our main model.”

      “Considering scan quality in T1w-derived metrics (from manual quality control ratings) yielded similar results to our main analysis, with an average correlation of 0.986±0.014 across imaging composite scores.”

      As for diffusion imaging, we also regressed out effects of head motion in addition to age, age², sex, site, and ethnicity from FA and MD measures and reported the consistency with our original results (P.36, P.19):

      “We tested another model which additionally included head motion parameters as regressors in our analyses of FA and MD measures, and assessed the consistency of findings from both models.”

      “Additionally considering head motion parameters from diffusion imaging metrics in our model yielded consistent results to those in our main analyses (mean r=0.891, S.D.=0.103; r=0.733-0.998).”

      Reviewer #3 - Public Review

      In this study, the authors utilized the Adolescent Brain Cognitive Development dataset to investigate the relationship between structural and functional brain network patterns and dimensions of psychopathology. They identified multiple components, including a general psychopathology (p) factor that exhibited a strong association with multimodal imaging features. The connectivity signatures associated with the p factor and neurodevelopmental dimensions aligned with the sensory-to-transmodal axis of cortical organization, which is linked to complex cognition and psychopathology risk. The findings were consistent across two separate subsamples and remained robust when accounting for variations in analytical parameters, thus contributing to a better understanding of the biological mechanisms underlying psychopathology dimensions and offering potential brain-based vulnerability markers.

      3.1) An intriguing aspect of this study is the integration of multiple neuroimaging modalities, combining structural and functional measures, to comprehensively assess the covariance with various symptom combinations. This approach provides a multidimensional understanding of the risk patterns associated with mental illness development.

      We thank the Reviewer for acknowledging the multimodal approach, and for the constructive suggestions.

      3.2) The paper delves deeper into established behavioral latent variables such as the p factor, internalizing, externalizing, and neurodevelopmental dimensions, revealing their distinct associations with morphological and intrinsic functional connectivity signatures. This sheds light on the neurobiological underpinnings of these dimensions.

      We are happy to hear the Reviewer appreciates the gain in understanding neural underpinnings of dimensions of psychopathology resulting from the current work.

      3.3) The robustness of the findings is a notable strength, as they were validated in a separate replication sample and remained consistent even when accounting for different parameter variations in the analysis methodology. This reinforces the generalizability and reliability of the results.

      We appreciate that the Reviewer found our robustness and generalizability assessment convincing.

      3.4) Based on their findings, the authors suggest that the observed variations in resting-state functional connectivity may indicate shared neurobiological substrates specific to certain symptoms. However, it should be noted that differences in resting-state connectivity between groups can stem from various factors, as highlighted in the existing literature. For instance, discrepancies in the interpretation of instructions during the resting state scan can influence the results. Hence, while their findings may indicate biological distinctions, they could also reflect differences in behavior.

      For the ABCD dataset, resting-state fMRI scans were based on eyes open and passive viewing of a crosshair, and are thus homogenized. We acknowledge, however, that there may still be state-to-state fluctuations contributing to the findings, and this is now discussed in the revised Discussion, on P.28. Note, however, that prior literature has generally also suggested rather modest impacts of cognitive and daily variation on resting-state functional networks, compared to much more dominating inter-individual and inter-group factors.

      “Finally, while prior research has shown that resting-state fMRI networks may be affected by differences in instructions and study paradigm (e.g., with respect to eyes open vs closed) (Agcaoglu et al., 2019), the resting-state fMRI paradigm is homogenized in the ABCD study to be passive viewing of a centrally presented fixation cross. It is nevertheless possible that there were slight variations in compliance and instructions that contributed to differences in associated functional architecture. Notably, however, there is a mounting literature based on high-definition fMRI acquisitions suggesting that functional networks are mainly dominated by common organizational principles and stable individual features, with substantially more modest contributions from task-state variability (Gratton et al. 2018). These findings, thus, suggest that resting-state fMRI markers can serve as powerful phenotypes of psychiatric conditions, and potential biomarkers (Abraham et al., 2017; Gratton et al., 2020; Parkes et al., 2020).”

      3.5) The authors conducted several analyses to investigate the relationship between imaging loadings associated with latent components and the principal functional gradient. They found several associations between principal gradient scores and both within- and between-network resting-state functional connectivity (RSFC) loadings. Assessing the analysis presented here proves challenging due to the nature of relating loadings, which are partly based on the RSFC, to gradients derived from RSFC. Consequently, a certain level of correlation between these two variables would be expected, making it difficult to determine the significance of the authors' findings. It would be more intriguing if a direct correlation between the composite scores reflecting behavior and the gradients were to yield statistically significant results.

      We thank the Reviewer for the comment, and agree that investigating gradient-behavior relationships could offer additional insights into the neural basis of psychiatric symptomatology. However, the current analysis pipeline precludes this direct comparison: the gradient analysis is performed on a region-by-region basis across the span of the cortical gradient, whereas the behavioral loadings are defined for each CBCL item, not for cortical regions.

      The Reviewer also raises concerns of potential circularity in our analysis, as we compared imaging loadings, which are partially based on RSFC, and gradient values generated from the same RSFC data. In response to this comment, we cross-validated our findings using an RSFC gradient derived from an independent dataset (HCP), showing highly consistent findings to those presented in the manuscript. This correlation is now reported in the Results section P.15.

      “A similar pattern of findings was observed when cross-validating between- and within-network RSFC loadings to a RSFC gradient derived from an independent dataset (HCP), with strongest correlations seen for between-network RSFC loadings for LC1 and LC3 (LC1: r=0.50, pspin<0.001; LC3: r=0.37, pspin<0.001).”

      We furthermore note similar correlations between imaging loadings and T1w/T2w ratio in the same participants, a proxy of intracortical microstructure and hierarchy (Glasser et al., 2011). These findings are now detailed in the revised Results, P.15-16:

      “Of note, we obtain similar correlations when using T1w/T2w ratio in the same participants, a proxy of intracortical microstructure and hierarchy (Glasser et al., 2011). Specifically, we observed the strongest association between this microstructural marker of the cortical hierarchy and between-network RSFC loadings related to LC1 (r=-0.43, pspin<0.001).”

      3.6) Lastly, regarding the interpretation of the first identified latent component, I have some reservations. Upon examining the loadings, it appears that LC1 primarily reflects impulse control issues rather than representing a comprehensive p-factor. Furthermore, it is worth noting that within the field, there is an ongoing debate concerning the interpretation and utilization of the p-factor. An insightful publication on this topic is "The p factor is the sum of its parts, for now" (Fried et al, 2021), which explains that the p-factor emerges as a result of a positive manifold, but it does not necessarily provide insights into the underlying mechanisms that generated the data.

      We thank the Reviewer for this comment, and have added greater nuance to the discussion of the association with the p factor. We furthermore discuss some of the ongoing debate about the use of the p factor, and cite the recommended publication on P.27.

      “Other factors have also been suggested to impact the development of psychopathology, such as executive functioning deficits, earlier pubertal timing, negative life events (Brieant et al., 2021), maternal depression, or psychological factors (e.g., low effortful control, high neuroticism, negative affectivity). Inclusion of such data could also help to add further mechanistic insights into the rather synoptic proxy measure of the p factor itself (Fried et al., 2021), and to potentially assess shared and unique effects of the p factor vis-à-vis highly correlated measures of impulse control.”

    1. Author response:

      Reviewer #2 (Public Review):

      This is, to my knowledge, the most scalable method for phylogenetic placement that uses likelihoods. The tool has an interesting and innovative means of using gaps, which I haven’t seen before. In the validation the authors demonstrate superior performance to existing tools for taxonomic annotation (though there are questions about the setup of the validation as described below).

      The program is written in C with no library dependencies. This is great. However, I wasn’t able to try out the software because the linking failed on Debian 11, and the binary artifact made by the GitHub Actions pipeline was too recent for my GLIBC/kernel. It’d be nice to provide a binary for people stuck on older kernels (our cluster is still on Ubuntu 18.04). Also, would it be hard to publish your .zipped binaries as packages?

      We have provided a binary (and zipped package) that supports Ubuntu 18.04 in GitHub Actions (https://github.com/lpipes/tronko/actions/runs/9947708087). This should facilitate the use of our software on older systems like yours. We were not able to test the binary, however, since GitHub did not seem to find any nodes with Ubuntu 18.04. It is important to note that Ubuntu 18.04 is deprecated. The latest version of Ubuntu is 24.04, and we recommend that users upgrade to newer, supported versions of their operating systems to benefit from the latest security updates and features.

      Thank you for publishing your source files for the validation on zenodo. Please provide a script that would enable the user to rerun the analysis using those files, either on zenodo or on GitHub somewhere.

      We have posted all datasets as well as scripts to Zenodo.

      The validations need further attention as follows.

      The authors have chosen data sets that are not well aligned with real-world use cases for this software, and as a result, its applicability is difficult to determine. First, the leave-one-species-out experiment made use of COI gene sequences representing 253 species from the order Charadriiformes, which includes bird species such as gulls and terns. What is the reasoning for selecting this data set given the objective of demonstrating the utility of Tronko for large scale community profiling experiments which by their nature tend to include microorganisms as subjects? If the authors are interested in evaluating COI (or another gene target) as a marker for characterizing the composition of eukaryotic populations, is the heterogeneity and species distribution of bird species within order Charadriiformes comparable to what one would expect in populations of organisms that might actually be the target of a metagenomic analysis?

      Our reasoning for selecting Charadriiformes is that these species are often mistaken for one another and there is a heavy reliance on COI for their species identification. This choice allows us to demonstrate Tronko’s ability to handle difficult and realistic identification challenges. Additionally, we aimed to simulate a challenging dataset to effectively differentiate between the methods used, showcasing Tronko’s robustness. Including more distantly related bird species would have simplified the identification process, which would not serve our objective of demonstrating the utility of Tronko for distinguishing closely related species. It is also important to note that all methods used the exact same reference database, which is not always the case in other comparative studies of species assignment.

      Furthermore, while our study uses bird species, the principles and techniques applied are broadly applicable to other taxa, including microorganisms. By selecting a dataset known for its identification difficulties, we underscore Tronko’s potential utility in a wide range of taxonomic profiling scenarios, including those involving high heterogeneity and closely related species, such as in microbial communities.

      Second, it appears that experiments evaluating performance for 16S were limited to reclassification of sequencing data from mock communities described in two publications, Schirmer (2015, 49 bacteria and 10 archaea, all environmental), and Gohl (2016; 20 bacteria - this is the widely used commercial mock community from BEI, all well-known human pathogens or commensals). The authors performed a comparison with kraken2, metaphlan2, and MEGAN using both the default database for each as well as the same database used for Tronko (kudos for including the latter). This pair of experiments provides a reasonable high-level indication of Tronko’s performance relative to other tools, but the total number of organisms is very limited, and particularly limited with respect to the human microbiome. It is also important to point out that these mock communities are composed primarily of type strains and provide limited species-level heterogeneity. The performance of these classification tools on type strains may not be representative of what one would find in natural samples. Thus, the leave-one-individual-out and leave-one-species-out experiments would have been more useful and informative had they been applied to extended 16S data sets representing more ecologically realistic populations.

      We thank the reviewer for this comment and we have included both an additional bacterial mock community dataset from Lluch et al. (2015) and an additional leave-one-species-out experiment. We describe how this leave-one-species-out dataset was constructed in our previous response to ’Essential Revisions’ #1. We also added Figure 5, S5, and S6.

      Finally, the authors should describe the composition of the databases used for classification as well as the strategy (and toolchain) used to select reference sequences. What databases were the reference sequences drawn from and by what criteria? Were the reference databases designed to reflect the composition of the mock communities (and if so, are they limited to species in those communities, or are additional related species included), or have the authors constructed general purpose reference databases? How many representatives of each species were included (on average), and were there efforts to represent a diversity of strains for each species? The methods should include a section detailing the construction of the data sets: as illustrated in this very study, the choice of reference database influences the quality of classification results, and the authors should explain the process and design considerations for database construction.

      To construct our databases, we used CRUX (Curd et al., 2018). This is described in the Methods section under ’Custom 16S and COI Tronko-build reference database construction’. All leave-one-out tests used downsampled versions of these two databases. It is beyond the scope of the manuscript to discuss how CRUX works. Additionally, we added the following text:

      To compare the new method (Tronko) to previous methods, we constructed reference databases for COI and 16S for common amplicon primer sets using CRUX (See Methods for exact primers used).

    1. Author response:

      Reviewer #1 (Public Review):

      In this manuscript, Perez-Lopez et al. examine the function of the chemokine CCL28, which is expressed highly in mucosal tissues during infection, but its role during infection is poorly understood. They find that CCL28 promotes neutrophil accumulation in the intestines of mice infected with Salmonella and in the lungs of mice infected with Acinetobacter. They find that Ccl28-/- mice are highly susceptible to Salmonella infection, and highly resistant and protected from lethality following Acinetobacter infection. They find that neutrophils express the CCL28 receptors CCR3 and CCR10. CCR3 was pre-formed and intracellular and translocated to the cell surface following phagocytosis or inflammatory stimuli. They also find that CCL28 stimulation of CCR3 promoted neutrophil antimicrobial activity, ROS production, and NET formation, using a combination of primary mouse and human neutrophils for their studies. Overall, the authors' findings provide new and fundamental insight into the role of the CCL28:CCR3 chemokine:chemokine receptor pair in regulating neutrophil recruitment and effector function during infection with the intestinal pathogen Salmonella Typhimurium and the lung pathogen Acinetobacter baumannii.

      We would like to thank the reviewer for their positive assessment of our work and for providing us with constructive comments that have helped us to improve the manuscript.

      Reviewer #2 (Public Review):

      In this manuscript by Perez-Lopez et al., the authors investigate the role of the chemokine CCL28 during bacterial infections in mucosal tissues. This is a well-written study with exciting results. They show a role for CCL28 in promoting neutrophil accumulation to the guts of Salmonella-infected mice and to the lung of mice infected with Acinetobacter. Interestingly, the functional consequences of CCL28 deficiency differ between infections with the two different pathogens, with CCL28-deficiency increasing susceptibility to Salmonella, but increasing resistance to Acinetobacter. The underlying mechanistic reasons for this suggest roles for CCL28 in enhanced neutrophil antimicrobial activity, production of reactive oxygen species, and formation of extracellular traps. However, additional experiments are required to shore up these mechanisms, including addressing the role of other CCL28-dependent cell types and further characterization of neutrophils from CCL28-deficient mice.

      We would like to thank the reviewer for the positive assessment of our work and for providing us with constructive comments that have helped us to improve the manuscript.

      Reviewer #3 (Public Review):

      The manuscript by Perez-Lopez and colleagues uses a combination of in vivo studies using knockout mice and elegant in vitro studies to explore the role of the chemokine CCL28 during bacterial infection on mucosal surfaces. Using the streptomycin model of Salmonella Typhimurium (S. Tm) infection, the authors demonstrate that CCL28 is required for neutrophil influx in the intestinal mucosa to control pathogen burden both locally and systemically. Interestingly, CCL28 plays the opposite role in a model lung infection by Acinetobacter baumannii, as Ccl28-/- mice are protected from Acinetobacter infection. Authors suggest that the mechanism by which CCL28 plays a role during bacterial infection is due to its role in modulating neutrophil recruitment and function.

      We would like to thank the reviewer for the positive assessment of our work and for providing us with constructive comments that have helped us to improve the manuscript.

      The major strengths of the manuscript are:

      The novelty of the findings that are described in the manuscript. The role of the chemokine CCL28 in modulating neutrophil function and recruitment in mucosal surfaces is intriguing and novel.

      Authors use Ccl28-/- mice in their studies, a mouse strain that has only recently been available. To assess the impact of CCL28 on mucosal surfaces during pathogen-induced inflammation, the authors choose not one but two models of bacterial infection (S. Tm and A. baumannii). This approach increases the rigor and impact of the data presented.

      Authors combine the elegant in vivo studies using Ccl28 -/- with in vitro experiments that explore the mechanisms by which CCL28 affects neutrophil function.

      The major weaknesses of the manuscript in its present form are:

      Authors use different time points in the S. Tm model to characterize the influx of immune cells and pathology. They do not provide a clear justification as to why distinct time points were chosen for their analysis.

      The reviewer raises a good point. As discussed in the detailed response to the reviewers, we have now generated extensive results at different time points and included these in the revised manuscript.

      Authors provide puzzling data that Ccl28-/- mice have the same numbers of CCR3- and CCR10-expressing neutrophils in the mucosa during infection. It is unclear why the lack of CCL28 expression would not affect the recruitment of neutrophils that express the receptors (CCR3 and CCR10) for this chemokine. Thus, these results need to be better explained.

      As discussed in the detailed response to the reviewers, we clarified that Ccl28-/- mice have reduced numbers of neutrophils in the mucosa during infection, but the percentage of CCR3+ and CCR10+ neutrophils does not change. We provide additional discussion of this point in the manuscript and in the response to the reviewers.

      The in vitro studies focus primarily on characterizing how CCL28 affects the function of neutrophils in response to S. Tm infection. There is a lack of data to demonstrate whether Acinetobacter affects CCR3 and CCR10 expression and recruitment to the cell surface and whether CCL28 plays any role in this process.

      We agree and have performed additional studies with Acinetobacter and CCL28, which we discuss in greater detail below in the response to the reviewers.

    1. eLife assessment

      This is a useful study on sex differences in gene expression across organs of four mice taxa, although there are some shortcomings in the data analyses and interpretations that should be better placed in the broader context of the current literature. Hence, the evidence in the current form is incomplete, with several overstated key conclusions.

    2. Reviewer #1 (Public Review):

      The authors describe a comprehensive analysis of sex-biased expression across multiple tissues and species of mouse. Their results are broadly consistent with previous work, and their methods are robust, as the large volume of work in this area has converged toward a standardized approach.

      I have a few quibbles with the findings, and the main novelty here is the rapid evolution of sex-biased expression over shorter evolutionary intervals than previously documented, although this is not statistically supported. The other main findings, detailed below, are somewhat overstated.

      (1) In the introduction, the authors conflate gametic sex, which is indeed largely binary (with small sperm, large eggs, no intermediate gametic form, and no overlap in size) with somatic sexual dimorphism, which can be bimodal (though sometimes is even more complicated), with a large variance in either sex and generally with a great deal of overlap between males and females. A good appraisal of this distinction is at https://doi.org/10.1093/icb/icad113. This distinction in gene expression has been recognized for at least 20 years, with observations that sex-biased expression in the soma is far less than in the gonad.

      For example, the authors frame their work with the following statement:

      "The different organs show a large individual variation in sex-biased gene expression, making it impossible to classify individuals in simple binary terms. Hence, the seemingly strong conservation of binary sex-states does not find an equivalent underpinning when one looks at the gene-expression makeup of the sexes"

      The authors use this conflation to set up a straw man argument, perhaps in part due to recent political discussions on this topic. They seem to be implying one of two things. a) That previous studies of sex-biased expression of the soma claim a binary classification. I know of no such claim, and many have clearly shown quite the opposite, particularly studies of intra-sexual variation, which are common - see https://doi.org/10.1093/molbev/msx293, https://doi.org/10.1371/journal.pgen.1003697, https://doi.org/10.1111/mec.14408, https://doi.org/10.1111/mec.13919, https://doi.org/10.1111/j.1558-5646.2010.01106.x for just a few examples. Or b) They are the first to observe this non-binary pattern for the soma, but again, many have observed this. For example, many have noted that reproductive or gonad transcriptome data cluster first by sex, but somatic tissue clusters first by species or tissue, then by sex (https://doi.org/10.1073/pnas.1501339112, https://doi.org/10.7554/eLife.67485).

      Figure 4 illustrates the conceptual difference between bimodal and binary sexual conceptions. This figure makes it clear that males and females have different means, but in all cases the distributions are bimodal.

      I would suggest that the authors heavily revise the paper with this more nuanced understanding of the literature and sex differences in their paper, and place their findings in the context of previous work.

      (2) The authors also claim that "sexual conflict is one of the major drivers of evolutionary divergence already at the early species divergence level." However, making the connection between sex-biased genes and sexual conflict remains fraught. Although it is tempting to use sex-biased gene expression (or any form of phenotypic dimorphism) as an indicator of sexual conflict, resolved or not, as many have pointed out, one needs measures of sex-specific selection, ideally fitness, to make this case (https://doi.org/10.1086/595841, 10.1101/cshperspect.a017632). In many cases, sexual dimorphism can arise in one sex only without conflict (e.g. 10.1098/rspb.2010.2220). As such, sex-biased genes alone are not sufficient to discriminate between ongoing and resolved conflict.

      (3) To make the case that sex-biased genes are under selection, the authors report alpha values in Figure 3B. Alpha value comparisons like this over large numbers of genes often have high variance. Are any of the values for male- female- and un-biased genes significantly different from one another? This is needed to make the claim of positive selection.
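
      One way to address the question in point (3) is a gene-level bootstrap on the difference in alpha between two gene classes. The sketch below is only an illustration under stated assumptions: it uses the simple pooled estimator alpha = 1 - (Ds * Pn) / (Dn * Ps), which may differ from the estimator the authors used, and the function names and the (Dn, Ds, Pn, Ps) count layout are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

# Per-gene counts (assumed layout): (Dn, Ds, Pn, Ps) =
# nonsynonymous/synonymous divergence and polymorphism counts.
Counts = Tuple[int, int, int, int]


def pooled_alpha(genes: Sequence[Counts]) -> float:
    """Pooled estimate alpha = 1 - (Ds * Pn) / (Dn * Ps), summing counts over genes."""
    dn = sum(g[0] for g in genes)
    ds = sum(g[1] for g in genes)
    pn = sum(g[2] for g in genes)
    ps = sum(g[3] for g in genes)
    if dn == 0 or ps == 0:
        raise ValueError("alpha is undefined when Dn or Ps is zero")
    return 1.0 - (ds * pn) / (dn * ps)


def bootstrap_alpha_difference(
    class_a: Sequence[Counts],
    class_b: Sequence[Counts],
    n_boot: int = 1000,
    seed: int = 0,
) -> Tuple[float, float]:
    """Approximate 95% interval for alpha(class_a) - alpha(class_b).

    Genes are resampled with replacement within each class. An interval
    that excludes zero suggests the two classes differ in alpha.
    """
    rng = random.Random(seed)
    diffs: List[float] = []
    for _ in range(n_boot):
        resampled_a = rng.choices(class_a, k=len(class_a))
        resampled_b = rng.choices(class_b, k=len(class_b))
        try:
            diffs.append(pooled_alpha(resampled_a) - pooled_alpha(resampled_b))
        except ValueError:
            continue  # skip replicates where alpha is undefined
    if not diffs:
        raise ValueError("no bootstrap replicate produced a defined alpha")
    diffs.sort()
    lower = diffs[int(0.025 * (len(diffs) - 1))]
    upper = diffs[int(0.975 * (len(diffs) - 1))]
    return lower, upper
```

      Applied to, say, male-biased versus unbiased genes, the returned interval gives a direct answer to whether the alpha values in Figure 3B differ beyond sampling noise.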

    3. Reviewer #2 (Public Review):

      The manuscript by Xie and colleagues presents transcriptomic experiments that measure gene expression in eight different tissues taken from adult female and male mice from four species. These data are used to make inferences regarding the evolution of sex-biased gene expression across these taxa. The experimental methods and data analysis are appropriate; however, most of the conclusions drawn in the manuscript have either been previously reported in the literature or are not fully supported by the data.

      There are two ways the manuscript could be modified to better strengthen the conclusions.

      First, some of the observed differences in gene expression have very little to no effect on other phenotypes, and are not relevant to medicine or fitness. Selectively neutral gene expression differences have been inferred in previous studies, and consistent with that work, sex-biased and between-species expression differences in this study may also be enriched for selectively neutral expression differences. This idea is supported by the analysis of expression variance, which indicates that genes that show sex-biased expression also tend to show more inter-individual variation. This perspective is also supported by the MK analysis of molecular evolution, which suggests that positive selection is more prevalent among genes that are sex-biased in both mus and dom, and genes that switch sex-biased expression are under less selection at the level of both protein-coding sequence and gene expression.

      As an aside, I was confused by (line 176): "implying that the enhanced positive selection pressure is triggered by their status of being sex-biased in either taxon." - don't the MK values suggest an excess of positive selection on genes that are sex-biased in both taxa?

      Without an estimate of the proportion of differentially expressed genes that might be relevant for broader physiological or organismal phenotypes, it is difficult to assess the accuracy and relevance of the manuscript's conclusions. One (crude) approach would be to analyze subsets of genes stratified by the magnitude of expression differences; while there is a weak relationship between expression differences and fitness effects, on average large gene expression differences are more likely to affect additional phenotypes than small expression differences. Another perspective would be to compare the within-species variance to the between-species variance to identify genes with an excess of the latter relative to the former (similar logic to an MK test of amino acid substitutions).

      Second, the analysis could be more informative if it distinguished between genes that are expressed across multiple tissues in both sexes that may show greater expression in one sex than the other, versus genes with specialized function expressed solely in (usually) reproductive tissues of one sex (e.g. ovary-specific genes). One approach to quantify this distinction would be metrics like those defined by [Yanai I, et al. 2005. Genome-wide midrange transcription profiles reveal expression-level relationships in human tissue specification. Bioinformatics 21:650-659.] These approaches can be used to separate out groups of genes by the extent to which they are expressed in both sexes versus genes that are primarily expressed in sex-specific tissue such as testes or ovaries. This more fine-grained analysis would also potentially inform the section describing the evolution/conservation of sex-biased expression: I expect there must be genes with conserved expression specifically in ovaries or testes (these are ancient animal structures!) but these may have been excluded by the requirement that genes be sex-biased and expressed in at least two organs.
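
      The tissue-specificity index tau from Yanai et al. (2005) is one such metric. A minimal sketch follows, assuming non-negative expression values (e.g. TPM) with one value per tissue; the function name is hypothetical, and any cutoff for calling a gene tissue-specific would be an analysis choice, not something fixed by the metric.

```python
from typing import Sequence


def tau(expression: Sequence[float]) -> float:
    """Tissue-specificity index tau.

    tau = sum_i (1 - x_i / x_max) / (n - 1), where x_i is the expression in
    tissue i and x_max is the maximum across the n tissues.
    0 means uniform expression; 1 means expression in a single tissue.
    """
    n = len(expression)
    if n < 2:
        raise ValueError("tau needs at least two tissues")
    x_max = max(expression)
    if x_max <= 0:
        raise ValueError("gene is not expressed in any tissue")
    return sum(1.0 - x / x_max for x in expression) / (n - 1)
```

      Computing tau per gene and per sex would let broadly expressed sex-biased genes be analysed separately from genes confined to testis or ovary.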

      There are at least three examples of statements in the discussion that at the moment misinterpret the experimental results.

      The discussion frames the results in the context of sexual selection and sexually antagonistic selection, but these concepts are not synonymous. Sexual selection can shape phenotypes that are specific to one sex, causing no antagonism; and fitness differences between males and females resulting from sexually antagonistic variation in somatic phenotypes may not be acted on by sexual selection. Furthermore, the conditions promoting, and the consequences of, both kinds of selection can be different, so they should be treated separately for the purposes of this discussion.

      The discussion claims that "Our data show that sex-biased gene expression evolves extremely fast" but a comparison or expectation for the rate of evolution is not provided. Many other studies have used comparative transcriptomics to estimate rates of gene expression evolution between species, including mice; are the results here substantially and significantly different from those previous studies? Furthermore, the experimental design does not distinguish between those gene expression phenotypes that are fixed between species as compared to those that are polymorphic within one or more species which prevents straightforward interpretation of differences in gene expression as interspecific differences.

      The conclusion that "Our results show that most of the genetic underpinnings of sex differences show no long-term evolutionary stability, which is in strong contrast to the perceived evolutionary stability of two sexes" - seems beyond the scope of this study. This manuscript does not address the genetic underpinnings of sex differences (this would involve eQTL or the like), rather it looks at sex differences in gene expression phenotypes. Simply addressing the question of phenotypic evolutionary stability would be more informative if genes expressed specifically in reproductive tissues were separated from somatic sex-biased genes to determine if they show similar patterns of expression evolution.

    4. Reviewer #3 (Public Review):

      This manuscript reports some interesting and important patterns. The results on sex-bias in different tissues and across four taxa would benefit from alternative (or additional) presentation styles. In my view, the most important results are with respect to alpha (fraction of beneficial amino acid changes) in relation to sex-bias (though the authors have made this as a somewhat minor point in this version).

      The part that the authors emphasize I don't find very interesting (i.e., the sexes have overlapping expression profiles in many nongonadal tissues), nor do I believe they have the appropriate data necessary to convincingly demonstrate this (which would require multiple measures from the same individual).

      This study reports several interesting patterns with respect to sex differences in gene expression across organs of four mice taxa. An alternative presentation of the data would yield a clearer and more convincing case that the patterns the authors claim are legitimate.

      I recommend that the authors clarify what qualifies as "sex-bias".

    5. Author response:

      We appreciate the time of the reviewers and their detailed comments, which will help to improve the manuscript.

      We are sorry that at least one reviewer seems to have had the impression that we have conflated issues about gonadal and non-gonadal sex phenotypes. This referee suggests that we should use Sharpe et al. (2023) to develop our concepts. However, what is discussed in Sharpe et al. was already the guiding principle for our study (without knowing this paper before). In our paper, we introduce the gonadal binary sex (which is self-evidently also the basis for creating the dataset in the first place, because we needed to separate males from females) and go then on to the question of (adult) sex phenotypes for the rest of the paper. The gonadal data are included only as comparison for contrasting the patterns in the non-gonadal tissues.

      Our study presents the largest systematic dataset so far on the evolution of sex-biased gene expression. It is also the first that explores the patterns of individual variation in sex-biased gene expression and the SBI is an entirely new procedure to directly visualize these variance patterns in an intuitive way (note that the relative position of the distributions along the X-axis is indeed not relevant). The results are actually quite nuanced (e.g. the rather dynamic changes seen in mouse kidney and liver comparisons) and go certainly beyond what would have been predictable based on the current literature.

      Also, we should like to point out that our study contradicts recent conclusions that were published in high profile journals, that had suggested that a substantial set of sex-biased genes has conserved functions between humans and mice and that mice can therefore be informative for gender-specific medicine studies. Our data suggest that only a very small set of genes are conserved in their sex-biased expression. These are epigenetic regulator genes and it will therefore be interesting in the future to focus on their roles in generating the differences between sexual phenotypes in given species.

      We will be happy to use the referee comments to clarify all of these points in a revised version. But we do not think that our "evidence is incomplete" and that there are several "overstated key conclusions". We have used all canonical statistical analyses that are typically used in papers of sex-biased gene expression, as acknowledged by reviewers 1 and 2. The additional statistical analyses that are requested are not within the scope of such papers, but could be subject to separate general studies, independent of the sex-bias analysis (e.g. the role of highly expressed genes versus low expressed genes, or the analysis of the fraction of neutrally evolving loci).

      Finally, it is unclear why the overall rating of the paper is at the lowest possible category ("useful study"), given that it adds a substantial amount of data and new insights into the exploration of the non-binary nature of sexual phenotypes.

    1. eLife assessment

      This paper provides valuable findings related to the impact and timing of exogenous interleukin 2 on the balance of exhausted (Tex) versus effector (Teff) cells that differentiate from precursor T cells (Tpex) during chronic viral infection. While the data appear solid, the overall claims that IL-2 suppresses Tpex are only partially supported, with the rationale for the timing of IL-2 treatment and its underlying mechanisms remaining unclear.

    2. Reviewer #1 (Public Review):

      Summary:

      The title states "IL-2 enhances effector function but suppresses follicular localization of CD8+ T cells in chronic infection", which the data in the paper show, but this does not seem to be the major goal of the authors. As stated in the short assessment above, the goal of this work seems to be to connect IL-2 signals, mostly given exogenously, to the differentiation of progenitor T cells (TPEX) that will help sustain effector T cell responses against chronic viral infection (TEX/TEFF). The authors mostly use chronic LCMV infection in mice as their model of choice, with flow cytometry, fluorescent microscopy, and some in vitro assays to explore how IL2 regulates TPEX and TEX/TEFF differentiation. Gain- and loss-of-function experiments are also conducted to explore the roles of IL2 signaling and BLIMP-1 in regulating these processes. Lastly, a loose connection of their mouse findings on TPEX/TEX cells to a clinical study using low-dose IL-2 treatment in SLE patients is attempted.

      Strengths:

      (1) The impact of IL-2 treatment of TPEX/TEX differentiation is very clear.

      (2) The flow cytometry data are convincing and state-of-the-art.

      Weaknesses:

      (1) The title appears disconnected from the major focus of the work.

      (2) The number of TPEX cells is not changed. IL2 treatment increases the number of TEFF and the proportion of TPEX is lower suggesting it does not target TPEX formation. The conclusion about an inhibitory role of IL2 treatment on TPEX formation seems therefore largely overstated.

      (3) Are the expanded TEX/TEFF cells really effectors? Only GrB and some cell surface markers are monitored (CD44, CD62L). Other functions should be included, e.g., CD107a, IFNg, TNF, chemokines - Tbet?

      (4) The rationale for IL2 treatment timing is unclear. It seems that this is given at the T cell contraction time, and this is interesting compared to the early treatment that ablates TPEX generation. Maybe this should really be explored further?

      (5) The TGFb/IL6/IL2 in vitro experiment does not bring much to the paper.

      (6) The Figure 2 data try to provide an explanation for a prior lack of difference in viral titers after IL2 treatment. It is hard to be convinced by these tissue section data as presented. It also begs the question of how the host would benefit from the low dose IL-2 treatment if IL-2 TEFF are not contributing to viral control as a result of their inappropriate localization to viral reservoirs.

      (7) It is unclear what the STAT5CA and BLIMP-1 KO experiments in Figure 3 add to the story that is not already expected/known.

      (8) The connection to the low-dose IL2 treatment in SLE patients is very loose and weak. This version is likely not the ligand that preferentially signals to CD122 either. SLE is different from a chronic viral infection and the question of timing seems critical from all the data shown in this manuscript. So it is very difficult to make any robust link to the mechanistic data.

      (9) It is really unclear what the take-home message is. IL-2 is signaling via STAT5 and BLIMP1 is also a known target as published by many groups including this one, and these results are more than expected. The observation that TEFF may be differentially localized in the WP area is interesting but no mechanisms are really provided (guessing CXCR5 but again expected). Also, all these observations are highly dependent on the timing of IL2 administration which is fascinating but not explored at all. It also limits significance since underlying mechanisms are unknown and we do not know when such treatment would have to be given.

    3. Reviewer #2 (Public Review):

      This study utilized the LCMV Docile infection model, which induces chronic and persistent infection in mice, leading to T cell exhaustion and dysfunction. Through exogenous IL-2 fusion protein treatment during the late stage of infection, the researchers found that IL-2 treatment significantly enlarges the antigen-specific effector CD8 T cells, expanding the CXCR5-TCF1- exhausted population (Tex) while maintaining the size of the CXCR5+TCF1+ precursors of exhausted T cell population (Tpex). This preservation of the Tpex population's self-renewing capacity allows for sustained T cell proliferation and antiviral responses.

      The authors discovered a dual effect of IL-2 treatment: it decreases CXCR5 expression on Tpex cells, restricting their entry into the B cell follicle. This may explain why IL-2 treatment has little impact on overall viral control. However, this finding also suggests a potential application of IL-2 treatment for autoimmune diseases, as it can suppress specific immune responses within the B cell follicle. Using imaging-based approaches, the team provided direct evidence that IL-2 treatment shifts the viral load to concentrate within the B cell follicle, correlating with the observed decrease in CXCR5 expression.

      Further, the researchers showed that ectopic expression of constitutively active STAT5, downstream of IL-2 induced cytokine signaling, in P14 TCR transgenic T cells (specific for an LCMV epitope), drove the T cell population toward the CXCR5- Tex phenotype over the CXCR5+ Tpex cells in vivo. Additionally, abrogating Blimp1, upregulated by active IL-2-phosphorylated STAT5 signaling, restored the CXCR5+ Tpex population.

      Building on these results, the researchers used an engineered IL-2 fusion protein, ANV410, targeting the beta-chain of the IL-2 receptor CD122, which successfully replicated their earlier findings. Importantly, the Tpex-sustaining effect of IL-2 was only observed when treatment was administered during the late stage of infection, as early treatment suppressed Tpex cell generation. Immune profiling of SLE patients undergoing low-dose IL-2 treatment showed a similar reduction in the CXCR5+ Tpex cell population.

      This study provides compelling data on the physiological consequences of IL-2 treatment during chronic viral infection. By leveraging the chronic and persistent LCMV Docile infection model, the researchers identified the temporal effects of IL-2 fusion protein treatment, offering strategic insights for therapies targeting cancer and autoimmune diseases.

    1. eLife assessment

      This is a mechanistic study showing the effect of combining inhibition of autophagy (through ULK1/2) and KRAS (using sotorasib) on KRAS mutant NSCLC making the study valuable to cancer biologists and more broadly in a clinical setting. The evidence generated by GEM mouse models and cell lines is solid but could be further strengthened by increasing the mouse cohort size. This study holds translational relevance beyond NSCLC to other indications that carry KRAS mutations.

    2. Reviewer #2 (Public Review):

      Summary:

      In this manuscript, Ghazi et al. reported that inhibition of KRASG12C signaling increases autophagy in KRASG12C expressing lung cancer cells. Moreover, the combination of DCC 3116, a selective ULK1/2 inhibitor, plus sotorasib displays cooperative/synergistic suppression of human KRASG12C driven lung cancer cell proliferation in vitro and tumor growth in vivo. Additionally, in genetically engineered mouse models of KRASG12C driven NSCLC, inhibition of either KRASG12C or ULK1/2 decreases tumor burden and increases mouse survival. Additionally, this study found that LKB1 deficiency diminishes the sensitivity of KRASG12C/LKB1Null-driven lung cancer to the combination treatment, perhaps through the emergence of mixed adeno/squamous cell carcinomas and mucinous adenocarcinomas.

      Strengths:

      Both human cancer cells and mouse models were employed in this study to illustrate that inhibiting ULK1/2 could enhance the responsiveness of KRASG12C lung cancer to sotorasib. This research holds translational importance.

      Weaknesses:

      The revised manuscript has addressed most of my previous concerns. However, I still have one issue: the sample size (n) for the GEMM study in Figures 4E and 4F is too small, despite the authors' explanation. The data do not support the conclusion due to the lack of significant difference in tumor burden. Additionally, the significance labels in Figure 4E are not clearly explained.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Given that KRAS inhibition approaches are a relatively new innovation and that resistance is now being observed to such therapies in patients with NSCLC, investigation of combination therapies is valuable. The manuscript furthers our understanding of combination therapy for KRAS mutant non-small cell lung cancer by providing evidence that combined inhibition of ULK1/2 (and therefore autophagy) and KRAS can inhibit KRAS-mutant lung cancer growth. The manuscript will be of interest to the lung cancer community but also to researchers in other cancer types where KRAS inhibition is relevant.

      Strengths:

      The manuscript combines cell line, cell line-derived xenograft, and genetically-engineered mouse model data to provide solid evidence for the proposed combination therapy.  The manuscript is well written, and experiments are broadly well performed and presented.

      We thank Reviewer #1 (R1) for the generally favorable review of our manuscript, and also for the more detailed critique that identifies potential weaknesses in the research, which we address on a point-by-point basis below. 

      Weaknesses:

      With 3-4 mice per group in many experiments, experimental power is a concern and some comparisons (e.g. mono vs combination therapy) seem to be underpowered to detect a difference. Both male and female mice are used in experiments which may increase variability.

      We thank R1 for pointing out concerns regarding statistical power in our various mouse models of NSCLC experiments, and agree that more mice per group would certainly increase statistical power.  However, there are certain logistical considerations that impact the generation of cohorts of experimental KrasLSL-G12C mice.  Because mice homozygous for the KrasLSL-G12C allele display embryonic lethality, we are required to generate experimental mice by crossing heterozygous male and female KrasLSL-G12C mice.  Although 66% of the progeny of such crosses are predicted to be KrasLSL-G12C/+, experience tells us that we only obtain ~40-50% heterozygous KrasLSL-G12C/+ mice with litter sizes around 6-8 mice from such crosses.  Therefore, there are usually only about 4 heterozygous KrasLSL-G12C mice per litter, which presents a substantial challenge in generating larger cohorts of age-matched mice suitable for experiments, especially under conditions where we wish to euthanize mice at multiple time points for analysis.  For the GEM model experiments, Figure 3B is the only experiment that has n=3.  All other experiments contain 4-6 mice per experimental condition.  We rationalized using both male and female mice because both human males and females have high lung cancer rates.
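
      The breeding arithmetic in this response can be checked with a short calculation. This is only a sketch of the expected values quoted above (a heterozygous cross with homozygous lethality, then the observed 40-50% heterozygote yield and litter sizes of 6-8); the function names and the example cohort size are hypothetical.

```python
import math
from fractions import Fraction


def mendelian_het_fraction_among_viable() -> Fraction:
    """Het x het cross: 1/4 +/+, 1/2 het, 1/4 homozygous (embryonic lethal).

    Among live-born progeny, the heterozygous fraction is (1/2) / (1/4 + 1/2).
    """
    wild_type, het = Fraction(1, 4), Fraction(1, 2)
    return het / (wild_type + het)


def expected_hets_per_litter(het_fraction: float, litter_size: int) -> float:
    """Expected number of heterozygous pups in one litter."""
    return het_fraction * litter_size


def litters_needed(target_mice: int, het_fraction: float, litter_size: int) -> int:
    """Litters required, on average, to obtain target_mice heterozygotes."""
    return math.ceil(target_mice / expected_hets_per_litter(het_fraction, litter_size))


# Example: at the observed upper bound (50% heterozygotes, litters of 8),
# a hypothetical cohort of 24 mice needs about 6 age-matched litters.
print(mendelian_het_fraction_among_viable())  # 2/3, i.e. the ~66% prediction
print(litters_needed(24, 0.5, 8))
```

      At the lower end of the observed range (40% heterozygotes, litters of 6), the expected yield falls to between 2 and 3 usable mice per litter, which illustrates why age-matched cohorts are hard to scale.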

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, Ghazi et al. reported that inhibition of KRASG12C signaling increases autophagy in KRASG12C-expressing lung cancer cells. Moreover, the combination of DCC 3116, a selective ULK1/2 inhibitor, plus sotorasib displays cooperative/synergistic suppression of human KRASG12C-driven lung cancer cell proliferation in vitro and tumor growth in vivo. Additionally, in genetically engineered mouse models of KRASG12C-driven NSCLC, inhibition of either KRASG12C or ULK1/2 decreases tumor burden and increases mouse survival. Additionally, this study found that LKB1 deficiency diminishes the sensitivity of KRASG12C/LKB1Null-driven lung cancer to the combination treatment, perhaps through the emergence of mixed adeno/squamous cell carcinomas and mucinous adenocarcinomas.

      Strengths:

      Both human cancer cells and mouse models were employed in this study to illustrate that inhibiting ULK1/2 could enhance the responsiveness of KRASG12C lung cancer to sotorasib. This research holds translational importance.

      We thank Reviewer #2 (R2) for the generally favorable review of our manuscript, and also for the more detailed critique that identifies potential weaknesses in the research, which we address on a point-by-point basis below. 

      Weaknesses:

      Additional validation of certain data is necessary.

      (1) mCherry-EGFP-LC3 reporter was used to assess autophagy flux in Figure 1A. Please explain how autophagy status (high, medium, and low) was defined. It's also suggested to show WB of LC3 processing in different treatments as in Figure 1A at 48 hours.

      We thank the reviewer for this comment and agree that a more thorough description of how autophagy status is assessed using the Fluorescent Autophagy Reporter (FAR) would benefit the readers of our manuscript.  Cells engineered to express the FAR are analyzed by flow cytometry in which we defined autophagy status by gating viable (based on Sytox Blue staining), DMSO-treated control cells into three bins based on the ratio of EGFP:mCherry fluorescence.  We gate all live cells into the 33% highest EGFP-positive cells (autophagy low) and the 33% highest mCherry-positive cells (autophagy high), and therefore, the proportion in the middle is also approximately 33% and considered the medium autophagy status.  Again, these gates are based entirely on the DMSO-treated control cells, and all other treatments within the experiment are compared using these gate settings.  In response to a specific manipulation (sotorasib, trametinib, DCC-3116 etc.) we assess how the specific treatment changes the percentages of cells in each of the pre-specified gates to assess increased autophagy (decreased EGFP:mCherry ratio) or decreased autophagy (increased EGFP:mCherry ratio). 

      Although LC3 processing and/or the expression of p62SQSTM1 are used by others as markers of autophagy, there is much debate in the literature as to how reliable immunoblotting analysis of LC3 processing or p62SQSTM1 expression are as measures of autophagy.  Certainly, in our hands, we find that the Fluorescent Autophagy Reporter is a much more sensitive measure of changes in autophagy in various different cancer cell lines as we have described in previous papers (Kinsey et al., PMID: 30833748, Truong et al., PMID: 32933997 and Silvis & Silva et al., PMID: 36719686).  Furthermore, in the omnibus publication that describes techniques for measuring autophagy (Klionsky et al., PMID: 33634751) the use of the FAR (or similarly configured reporters) is regarded as the gold standard for measuring autophagy status in cells.  We have amended the Materials & Methods section of our manuscript to better describe the use of the FAR in measuring autophagy. 
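
      The gating scheme described in this response can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' analysis pipeline: it assumes a per-cell EGFP:mCherry ratio has already been computed for viable cells, it uses tercile cut points from the DMSO control as the gates, and the function names are hypothetical.

```python
from statistics import quantiles
from typing import Dict, Sequence, Tuple


def tercile_gates(control_ratios: Sequence[float]) -> Tuple[float, float]:
    """Cut points splitting DMSO-control EGFP:mCherry ratios into thirds."""
    low_cut, high_cut = quantiles(control_ratios, n=3, method="inclusive")
    return low_cut, high_cut


def autophagy_fractions(
    ratios: Sequence[float], gates: Tuple[float, float]
) -> Dict[str, float]:
    """Fraction of cells in each control-defined gate.

    A low EGFP:mCherry ratio means high autophagy (EGFP is quenched in the
    acidic autolysosome); a high ratio means low autophagy.
    """
    low_cut, high_cut = gates
    n = len(ratios)
    high = sum(r <= low_cut for r in ratios)   # lowest ratios: autophagy high
    low = sum(r > high_cut for r in ratios)    # highest ratios: autophagy low
    medium = n - high - low
    return {"high": high / n, "medium": medium / n, "low": low / n}


# Gates are set once on the control and then reused for every treatment.
control = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
gates = tercile_gates(control)
treated = [1.0, 1.0, 2.0, 2.0, 3.0, 5.0]  # hypothetical treated sample
print(autophagy_fractions(control, gates))
print(autophagy_fractions(treated, gates))
```

      Because the gates are fixed on the control, a shift of treated cells toward the "high" bin reads directly as increased autophagy relative to DMSO.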

      (2) For Figures 1J, K, and L, please provide immunohistochemistry (IHC) images demonstrating RAS downstream signaling blockade by sotorasib and autophagy blockade by DCC 3116 in tumors.

      We thank the reviewer for the comment and have probed the tumors from the xenograft experiments in Figures 1J, K, and L for pERK1/2 and p62SQSTM1 to determine the biochemical activity of sotorasib or DCC-3116, respectively and have provided representative images below. We observed the expected decrease in pERK and p62 signal after sotorasib treatment in all three xenografted cell lines. We did observe the expected accumulation of p62 in the DCC-3116 treated tumors from the NCI-H2122 and NCI-H358 cell lines. There appears to be no difference between the vehicle and DCC-3116 treated tumors in the NCI-H358 cell line-derived tumors as detected by IHC.

      Author response image 1.

      (3) Given that both DCC 3116 and ULK1K46N exhibit the ability to inhibit autophagy and synergize with sotorasib in inhibiting cell proliferation, in addition to demonstrating decreased levels of pATG13 via ELISA assay, please include Western blot analyses of LC3 or p62 to confirm the blockade of autophagy by DCC 3116 and ULK1K46N in Figure 1 & Figure 2.

      We appreciate the reviewer's comment and have performed an immunoblot analysis of cells treated with DCC-3116 or expressing ULK1K46N and probed for p62SQSTM1 and LC3 expression.  We did observe the expected accumulation of p62 SQSTM1 in NCI-H2122 (ULK1K46N) cells treated with 1ug/ml doxycycline to induce expression of ULK1K46N compared to DMSO treatment.  Additionally, we treated the human cell lines from Figure 1 with sotorasib and/or DCC-3116 and tested for p62SQSTM1 expression after 48 hours of treatment. In the human cell lines NCI-H2122 and NCI-H358, there was a decrease in the p62 signal with increasing doses of sotorasib, as expected. There was no detectable change in p62 levels in the Calu-1 cells by immunoblot. For LC3-I/LC3-II, there was only one detectable band in the NCI-H2122 cells, which makes it difficult to interpret the results and further emphasizes why we use the fluorescent autophagy reporter which is more sensitive than immunoblotting. There is no detectable change in LC3-I/LC3-II in the Calu-1 cells treated with increasing doses of sotorasib, but the expected decrease in LC3-I is observed with sotorasib treatment in the NCI-H358 cells.

      Author response image 2.

      (4) Since adenocarcinomas, adenosquamous carcinomas (ASC), and mucinous adenocarcinomas were detected in KL lung tumors, please conduct immunohistochemistry (IHC) to detect these tumors, including markers such as p63, SOX2, Keratin 5.

      We have included IHC analysis of the adenosquamous carcinomas for the markers p63, SOX2, and Keratin 5 from the KL mouse in Figure 3 and the ASC tumors in Supplemental Figure 4, and thank the reviewer for this excellent suggestion. The staining for these markers is below. Of note, we tried two different SOX2 antibodies (Cell Signaling Technology #14962 and Cell Signaling Technology #3728) and could not detect any staining in any section.

      Author response image 3.

      (5) Please provide the sample size (n) for each treatment group in the survival study (Figure 4E). It appears that all mice were sacrificed for tumor burden analysis in Figure 4F. However, there doesn't seem to be a significant difference among the treatment groups in Figure 4F, which contrasts with the survival analysis in Figure 4E. It is suggested to increase the sample size in each treatment group to reduce variation.

      We have updated Figure 4E to indicate sample size for each treatment group and thank the reviewer for this suggestion.  Any mice that remained on study through the entire 8-week treatment regimen were sacrificed after the last day of treatment (Day 56).  Figure 4F indicates analysis of total tumor burden in all mice that remained on treatment for the full 8 weeks and mice that reached euthanasia criteria before the end of the 8-week treatment.  Therefore, it is important to note that the mice in Figure 4F were not all euthanized on the same day.  There is no statistically significant difference between the 3 treatment groups (sotorasib, DCC-3116, combination).  This may be due to a lower sample size as well as ending the treatment at 8 weeks as opposed to continuing the treatment for a longer period of time.  Although we agree that increasing the sample size would benefit the study, due to how long the GEMM model experiments take (12-16 weeks of breeding, 6 weeks for the mice to reach adulthood, 10 weeks of tumor formation post-initiation, 8 weeks of treatment= ~40 weeks) we would respectfully submit that the analysis of additional mice is outside the scope of the current revised manuscript.

      (6) In KP mice (Figure 5), it seems that a single treatment alone is sufficient to inhibit established KP lung tumor growth. Combination treatment does not further enhance anti-tumor efficacy. Therefore, this result doesn't support the conclusion generated from human cancer cell lines. Please discuss.

      We thank the reviewer for this observation.  Indeed, KP lung tumors were sensitive to single agent DCC-3116 treatment, which is reflected in the tumor burden analysis.  This was somewhat surprising to us as we have not previously detected much anti-tumor activity using 4-aminoquinolines (chloroquine or hydroxychloroquine) or other autophagy inhibitors.  It should be noted however that the KRASG12C/TP53R175H NSCLC model has a very low tumor burden overall (~4% in vehicle-treated mice).  Additionally, our microCT imager cannot detect AAH and small tumors at the settings/resolution used.  Therefore, we were limited in our ability to detect small tumors or hyperplasia by microCT imaging.  Although there was a decrease in overall tumor burden with single agent DCC-3116 treatment, we could not demonstrate using microCT imaging that KRASG12C/TP53R175H lung tumors were actually regressing with single agent DCC-3116 treatment.  The larger tumors that were detected appeared to show a cytostatic effect (i.e. no or slow growth) with DCC-3116 monotherapy.  This may reflect our inability to detect regression of AAH or small tumors with the microCT.  In all human cell lines tested, the only cell line that responded to single agent DCC-3116 treatment was NCI-H358 cells, which do have a complete heterozygous loss of the TRP53 gene and lack TP53 protein.  However, other cells that also have a loss of TP53 expression (Calu-1) are insensitive to single-agent DCC-3116 treatment. Due to the low mutational burden of the KP mouse model compared to human NSCLC cell lines driven by mutationally-activated KRASG12C and the loss of TP53 function, it is difficult to directly compare GEM models to the human cell line models.  Most of the human cell lines have alterations in other genes that are not altered in the KP mouse model which could affect the sensitivity of treatment.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Minor comments:

      (1) Figure legends are currently not adequate - information about the number and nature of replicates, stats, and definitions of the labelling used for stats should be added throughout. In Figure 5B, only two lines of four are labelled with * or ns.

      We thank the reviewer for this comment and have included more details in the figure legends that describe replicates, statistical analysis and definitions of labeling.  We also note that the methods section has a detailed description of the statistical analysis used.

      (2) What statistical test is performed on Figure 5E to get a p < 0.05 between the vehicle and DCC group?

      We performed a one-way ANOVA for all statistical analyses with more than 2 experimental groups. We thank the reviewer for pointing out this typo. These data points (vehicle vs. DCC-3116) are not statistically significant, and the figure has been revised accordingly.
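
      For readers unfamiliar with the test named above, the following is a minimal sketch of how a one-way ANOVA F statistic is computed across more than two groups. It is not the authors' analysis code: the function name and the toy measurements are assumptions made purely for illustration, and converting F to a p-value (which requires the F distribution) is left out.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups.

    Illustrative helper only; the name and interface are hypothetical.
    """
    k = len(groups)                              # number of groups
    n_total = sum(len(g) for g in groups)        # total observations
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between-group sum of squares: group size times squared mean offset.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Within-group sum of squares: squared deviation from each group's mean.
    ss_within = sum(
        (x - sum(g) / len(g)) ** 2 for g in groups for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up numbers standing in for three treatment groups (not real data).
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_stat, df_b, df_w = one_way_anova_f(toy_groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

      A significant omnibus F only indicates that at least one group differs; a pairwise claim such as vehicle vs. DCC-3116 would still need a post hoc comparison.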

      (3) The manuscript figures would be improved by the use of a colourblind-friendly palette.

      We have previously published multiple manuscripts using this color scheme for the fluorescent autophagy reporter experiments and chose to use red and green as the reporter uses EGFP and mCherry. We wanted to keep this color scheme consistent across our publications and would prefer not to change the colors. However, we agree with the reviewer that the data should be accessible to all people and, therefore, have updated these graphs to include slashes over the red color to make the red and green colors easier to tell apart. Thank you to the reviewer for this excellent suggestion.

      (4) The manuscript should be fully checked for mouse (sentence case) and human (caps) gene (italics) and protein (non-italics).

      In this manuscript we are using the nomenclatures approved by the HUGO Gene Nomenclature Committee (https://en.wikipedia.org/wiki/HUGO_Gene_Nomenclature_Committee) in which:

      Human genes are written as KRAS, TP53 etc i.e. ITALICIZED CAPS

      Mouse genes are written as Kras, Trp53 etc:  i.e. Italicized and sentence case

      Human and mouse proteins are written as KRAS, TP53 etc:  i.e. NON-ITALICIZED CAPS

      In response to the reviewer’s suggestion, we have gone through the manuscript to check for this and make any appropriate changes.  Of note, we intentionally refer to the mouse protein changes as KRASG12C/LKB1null or KRASG12C/TP53R172H (capitalized), as this references the protein change and not the nucleotide change that occurs in the gene.

      (5) Adenosquamous is the correct term for the disease.  In parts, it's referred to as adeno/squamous or adeno-squamous.  The abbreviation ADC is also defined many times.

      Thank you to the reviewer for this comment.  We have corrected the manuscript text to only use adenosquamous and only define ADC in the first instance.

      (6) Line 434 - "as previously described" but no reference.

      Typos:

      (1) Line 117 – either

      (2) Line 314 – synergistic

      (3) Line 317 – therefore

      (4) Line 502 – medium

      We thank the reviewer for pointing out these typos and have modified the text appropriately.

      Reviewer #2 (Recommendations For The Authors):

      (1) The statement on Page 4, Lines 119-120, lacks clarity: 'Furthermore, LKB1 silencing diminishes the sensitivity of KRASG12C/LKB1Null-driven lung cancer perhaps through the emergence of mixed adeno/squamous cell carcinomas and mucinous adenocarcinomas.' It is unclear whether this refers to the sensitivity to the combination treatment or to the KRASG12C inhibitor alone.

      We thank the reviewer for this comment and agree that the statement lacks clarity.  The intent of this statement was to refer to both single agent sotorasib treatment as well as the combination with DCC-3116.  

      (2) Page 5 Line 147 "KRASG12X ". Please correct this typo.

      We thank the reviewer for this comment, but this is not a typo. We intended for this line to state KRASG12X to refer to cell lines with any KRASG12 alteration, e.g., KRASG12D, KRASG12C, KRASG12S, KRASG12R, etc.

      (3) The color of the dots in Figure 5B labeling does not match the dots in the graph.

      For all bar graphs in the manuscript, the dots representing individual mice are black, and the bar itself is color-coded based on treatment type. The dots in Figure 5B follow this pattern and are intended to be this way.

      (4) Figure 5C depicts lung weight rather than tumor growth, contrary to the text description "regression of pre-existing lung tumors was detected by microCT scanning (Figure 5C, Figure S5)".

      Figure 5C does not depict lung weight but the percent body weight change in treated mice, described in the figure legend.  We thank the reviewer for pointing this out because we referenced the wrong panel in the text.  The figures referenced should be Figure 5B, Figure S5.  We have corrected this in the text.

    1. Reviewer #1 (Public Review):

      Summary:

      The authors profile gene expression, chromatin accessibility and chromosomal architecture (by Hi-C) in activated CD4 T cells and use this information to link non-coding variants associated with autoimmune diseases with putative target genes. They find over 1000 genes physically linked with autoimmune disease loci in these cells, many of which are upregulated upon T cell activation. Focusing on IL2, they dissect the regulatory architecture of this locus, including the allelic effects of GWAS variants. They also intersect their variant-to-gene lists with data from CRISPR screens for genes involved in CD4 T cell activation and expression of inflammatory genes, finding enrichments for regulators. Finally, they showed that pharmacological inhibition of some of these genes impacts T cell activation.

      This is a solid study that follows a well-established canvas for variant-to-gene prioritisation using 3D genomics, applying it to activated T cells. The authors go some way in validating the lists of candidate genes, as well as explore the regulatory architecture of a candidate GWAS locus. Jointly with data from previous studies performing variant-to-gene assignment in activated CD4 T cells (and other immune cells), this work provides a useful additional resource for interpreting autoimmune disease-associated genetic variation.

      Autoimmune disease variants were already linked with genes in CD28-stimulated CD4 T cells using chromosome conformation capture, specifically Promoter CHi-C and the COGS pipeline (Javierre et al., Cell 2016; Burren et al., Genome Biol 2017; Yang et al., Nat Comms 2020). The authors cite these papers and present a comparative analysis of their variant-to-gene assignments (in addition to scRNA-seq eQTL-based assignments). Furthermore, they find that the Burren analysis yields a higher enrichment for gold standard genes.

      I thank the authors for their revisions in response to my initial review. The revised version now includes a more comprehensive comparative analysis of different datasets and V2G approaches and discusses the potential sources of differences in the results. Most significantly, the authors have now included an interesting comparison of their methodology with the popular ABC technique and outlined the key limitations of ABC relative to their method and other (Capture) Hi-C-based V2G approaches.

    2. eLife assessment

      This is a solid study that follows a well-established canvas for variant-to-gene prioritisation using 3D genomics, applying it to activated T cells. The authors go some way in validating the lists of candidate genes, as well as exploring the regulatory architecture of a candidate GWAS locus. Jointly with data from previous studies performing variant-to-gene assignment in activated CD4 T cells (and other immune cells), this work provides a useful additional resource for interpreting autoimmune disease-associated genetic variation.

    3. Reviewer #2 (Public Review):

      Summary:

      There is significant interest in characterizing the mechanisms by which genetic mutations linked to autoimmunity perturb immune processes. Pahl et al. collect information on dynamic accessible regions, genes, and 3D contacts in primary CD4+ T cell samples that have been stimulated ex vivo. The study includes a variety of analyses characterizing these dynamic changes. With TF footprinting they propose factors linked to active regulatory elements. They compare the performance of their variant mapping pipeline that uses their data versus existing datasets. Most compelling was a deep dive into the regulatory elements near the IL2 gene. Finally, they perform a pharmacological screen targeting several genes they suggest are involved in T cell proliferation.

      Strengths:

      - The work done characterizing elements at the IL2 locus is impressive.

      Weaknesses:

      - There are extensive studies performed on resting and activated immune cell states (CD4+ T cells and other cell types) and some at multiple time points or concentrations of stimuli that collect ATAC-seq and/or RNA-seq. Several analyses performed in published studies were similarly performed in this study. I expected the authors to at least briefly mention published studies and whether their conclusions generally agree or disagree. Are the same dynamic regulatory regions or genes identified upon T cell activation? Are the same TF footprints enriched in these dynamic regulatory elements? In the revision, I appreciate that the authors now include additional data from several studies that I had initially suggested for the purposes of nominating disease genes in their precision-recall analysis.

    4. Reviewer #3 (Public Review):

      Summary:

      This paper used RNAseq, ATACseq, and Hi-C to assess gene expression, chromatin accessibility, and chromatin physical associations for naïve CD4+ T cells as they respond to stimulation through TCR and CD28. With these data in hand, the authors linked 423 GWAS signals to their respective target genes, where most of these were not in the proximal promoter, but rather distal enhancers. The IL-2 gene was used as an example to identify new distal cis-regulatory regions required for optimal IL-2 gene transcription. These distal elements interact with the proximal IL2 promoter region. When the distal enhancer contained an autoimmune SNP, it affected IL-2 gene transcription. The authors also identified genetic risk variants that were associated with genes upon activation. Some of these regulate proliferation and cytokine production, but others were novel.

      Strengths:

      This paper provides a wealth of data related to gene expression after CD4 T cells are activated through the TCR and CD28. An important strength of this paper is that these data were intensively analyzed to uncover autoimmune disease SNPs in cis acting regions. Many of these could be assigned to likely target genes even though they often are in distal enhancers. These findings help to provide a better understanding concerning the mechanism by which GWAS risk elements impact gene expression.

      Another strength to this study was the proof-of-principle studies examining the IL-2 gene. Not only were new cis acting enhancers discovered, but they were functionally shown to be important in regulating IL-2 expression, including susceptibility to colitis. Their importance was also established with respect to such distal enhancers harboring disease relevant SNPs, which were shown to affect IL-2 transcription.

      The data from this study were also mined against past CRISPR screens that identified genes that control aspects of CD4 T cell activation. From these comparisons, novel genes were identified that function during T cell activation.

      Weaknesses:

      A weakness from this study is that few individuals were analyzed, i.e., RNAseq and ATACseq (n=3) and HiC (n=2). Thus, the authors may have underestimated potentially relevant risk associations by their chromatin capture-based methodology. This might account for low overlap of their data with the eQTL-based approach or the HIEI truth set.

      The authors explain that the low overlap is not due to few GWAS associations by HiC. The expanded discussion in the revised manuscript provides a framework to help explain inherent differences between these methods that may contribute to the low overlap.

      Impact:

      This study indicates that defining distal chromatin interacting regions helps to identify distal genetic elements, including relevant variants, that contribute to gene activation.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      Summary:

      The authors profile gene expression, chromatin accessibility, and chromosomal architecture (by Hi-C) in activated CD4 T cells and use this information to link non-coding variants associated with autoimmune diseases with putative target genes. They find over 1000 genes physically linked with autoimmune disease loci in these cells, many of which are upregulated upon T cell activation. Focusing on IL2, they dissect the regulatory architecture of this locus, including the allelic effects of GWAS variants. They also intersect their variant-to-gene lists with data from CRISPR screens for genes involved in CD4 T cell activation and expression of inflammatory genes, finding enrichments for regulators. Finally, they showed that pharmacological inhibition of some of these genes impacts T-cell activation. 

      This is a solid study that follows a well-established canvas for variant-to-gene prioritisation using 3D genomics, applying it to activated T cells. The authors go some way in validating the lists of candidate genes, as well as exploring the regulatory architecture of a candidate GWAS locus. Jointly with data from previous studies performing variant-to-gene assignment in activated CD4 T cells (and other immune cells), this work provides a useful additional resource for interpreting autoimmune disease-associated genetic variation. 

      Suggestions for improvement:

      Autoimmune disease variants were already linked with genes in CD28-stimulated CD4 T cells using chromosome conformation capture, specifically Promoter CHi-C and the COGS pipeline (Javierre et al., Cell 2016; Burren et al., Genome Biol 2017; Yang et al., Nat Comms 2020). The authors cite these papers and present a comparative analysis of their variant-to-gene assignments (in addition to scRNA-seq eQTL-based assignments). Furthermore, they find that the Burren analysis yields a higher enrichment for gold standard genes. 

      The obvious question that the authors don't venture into is why the results are quite different. In principle, this could be due to the differences between: 

      (a) the cell stimulation procedure 

      (b) the GWAS datasets used 

      (c)  the types of assay (Hi-C vs Capture Hi-C) 

      (d) approaches for defining gene-linked regions (loops vs neighbourhoods) 

      (e) how the GWAS signals at gene-linked regions are aggregated (e.g., the flavours of COGS in Javierre and Burren vs the authors' approach)

      Re (a), I'm not sure the authors make it explicitly clear in the main text that the Capture Hi-C-based studies also use *stimulated* CD4 T cells, particularly in the section "Comparative predictive power...". So the cells used are pretty much the same, and the differences likely arise from points (b) to (e).

      It would be useful for the community to understand more clearly what is driving these differences, ideally with some added data. Could the authors, for example, take the PCHi-C data from Javierre/Burren and use their GWAS data and variant-to-gene assignment algorithms? 

      We greatly appreciate the referee’s expert assessment of our work and its value to the field, and we are glad that the referee was enthused by our comparison of the predictive power of the various V2G approaches. A point not emphasized enough in the original version of the manuscript is that we actually did harmonize the various datasets in the way the referee suggests for the precision/recall analysis. We took the contact maps presented from each paper, mapped genes using the same set of GWAS SNPs, and defined all gene-linked regions using our loop calling approach. This has been clarified in the revised version of the manuscript. We have now included a more thoughtful discussion of the possible sources of discrepancy between the different studies included in the comparison, and our thoughts on the potential sources raised by the referee are outlined below:

      (a) The modes of stimulation used are similar between studies, but timepoints and donors did vary, and ours was the only study that sorted naïve CD4+ T cells before stimulation. These aspects could represent a source of variability. 

      (b) The GWAS is not a source of variability because we re-ran the raw data from all the orthogonal studies through our V2G pipeline using the same GWAS as in the current manuscript. 

      (c) The use of HiC vs. Capture HiC is a likely source of variability. The Capture-HiC datasets included in our comparison are lower resolution (i.e. HindIII) but focus higher sequencing depth at promoters compared to our HiC datasets – i.e., Capture-HiC may mis-call loops to the wrong promoters due to lower resolution as we have shown in our previous study [Su, Human Genetics, 2021], and will miss distal SNP interactions at promoters not included in the capture set. While HiC is unbiased in this regard, HiC will fail to call some SNP-promoter loops called by CaptureHiC because the sequencing power is not specifically focused at promoters. 

      (d) For studies using neighborhood approaches, we re-ran the raw data through our loop calling algorithm to connect distal SNPs to gene promoters, and regarding (e) above, we ran the raw data through our V2G pipeline to allow a better comparison.

      In addition, given that the authors use Hi-C, a popular method for V2G prioritisation for this type of data is currently ABC (Nasser et al, Nature 2021). Could the authors provide a comparative analysis with respect to the V2G assignments in the paper and, if they see it appropriate, also run ABC-based GWAS integration on their own Hi-C data?

      This is an excellent suggestion, which we have followed in the revised version of our manuscript. It should be noted (and we do so in the text of the revision) that there is an important caveat to bringing in the ABC model. Chromosome conformation-based approaches are biologically constrained (i.e., informed) by the natural structure of chromatin in the nucleus that controls how gene transcription is regulated in cis, and it does so in a way that brings value to GWAS data. However, the ABC model further constrains the input data by imposing non-biological filters that allow the algorithm to be applied, but impose artifactual limitations that may negatively impact interpretation and discovery. In addition to filtering out pseudogenes, bidirectional RNA, antisense RNAs, and small RNAs, the ABC model gene set eliminates genes ubiquitously expressed across tissues (based on the assumption that these genes are driven primarily by elements adjacent to their promoters) and only allows annotation of one promoter per gene, even though the median number of promoters per gene in the human genome is three. In contrast, our chromatin-based V2G removes pseudogenes, but includes lincRNA and small RNAs, and includes all alternative transcription start sites annotated by gencode. 

      To apply the ABC GWAS gene nomination model to our CD4+ T cell chromatin-based V2G data, we used our ATAC-seq data and publicly available CD4+ T cell H3K27ac ChIP-seq data as input, and integrated this with GWAS and the average ENCODE-derived HiC dataset from the original ABC paper. The activity-by-contact model nominated 650 genes, compared to 1836 genes when using our cell type-matched HiC data and analysis pipeline. Only 357 of these genes were nominated by both approaches; 1479 genes nominated by our approach were not nominated by ABC, while 293 genes not implicated by our approach were newly implicated by ABC. To determine how the ABC-constrained approach performs against the HIEI gold standard set, we subjected all datasets used for the comparison depicted in the new Figure 5D to the same promoter filter used by the ABC model as part of the precision-recall re-analysis. First, we found that applying the restricted ABC model promoter annotation to all datasets did not have a large effect on recall; however, the precision of several datasets was affected. For example, using the restricted promoter set reduced the precision of our (Pahl) V2G approach and inflated the precision of the nearest gene to SNP metric. Second, the new precision-recall analysis shows that the ABC score-based approach is only half as sensitive at predicting HIEI genes as the chromatin-based V2G approaches. This indicates that constraining GWAS data with cell type- and state-specific 3D chromatin-based data brings more GWAS target gene predictive power than application of the multi-tissue-averaged HiC used by the ABC model. We thank the reviewer for helpful suggestions that have improved the quality of our study.
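
      The overlap figures quoted above are internally consistent, which the following small sketch checks. It uses only the three counts stated in the response (650, 1836 and 357); the function name and the Jaccard summary are additions made here for illustration and are not part of the authors' pipeline.

```python
def overlap_summary(n_a, n_b, n_shared):
    """Summarise the overlap of two gene lists from their sizes alone.

    Hypothetical helper: works on counts because the gene identities
    are not listed in the response.
    """
    only_a = n_a - n_shared          # genes unique to list A
    only_b = n_b - n_shared          # genes unique to list B
    union = n_a + n_b - n_shared     # genes nominated by either list
    return {
        "only_a": only_a,
        "only_b": only_b,
        "union": union,
        "jaccard": n_shared / union,
        "fraction_of_a_shared": n_shared / n_a,
        "fraction_of_b_shared": n_shared / n_b,
    }


# Counts as reported: cell type-matched HiC V2G (A) versus ABC (B).
summary = overlap_summary(n_a=1836, n_b=650, n_shared=357)
for key, value in summary.items():
    print(key, round(value, 3) if isinstance(value, float) else value)
```

      The derived values of 1479 and 293 match the numbers given in the text, and the shared genes make up a little over half of the ABC nominations but under a fifth of the HiC-based ones.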

      Reviewer #2 (Public Review): 

      Summary:

      There is significant interest in characterizing the mechanisms by which genetic mutations linked to autoimmunity perturb immune processes. Pahl et al. collect information on dynamic accessible regions, genes, and 3D contacts in primary CD4+ T cell samples that have been stimulated ex vivo. The study includes a variety of analyses characterizing these dynamic changes. With TF footprinting they propose factors linked to active regulatory elements. They compare the performance of their variant mapping pipeline that uses their data versus existing datasets. Most compelling there was a deep dive into additional study of regulatory elements nearby the IL2 gene. Finally, they perform a pharmacological screen targeting several genes they suggest are involved in T cell proliferation. 

      Strengths:

      The work done characterizing elements at the IL2 locus is impressive. 

      Weaknesses:

      Missing critical context to evaluate claims. There are extensive studies performed on resting and activated immune cell states (CD4+ T cells and other cell types) and some at multiple time points or concentrations of stimuli that collect ATAC-seq and/or RNA-seq that have been ignored by this study. How do conclusions from previous studies compare to what the authors conclude here? It is impossible to evaluate the claims without this additional context. These are a few studies I am familiar with (the authors should perform a more comprehensive search to be sure they're not ignoring existing observations) that would be important to compare/contrast conclusions:

        - Alasoo, K. et al. Shared genetic effects on chromatin and gene expression indicate a role for enhancer priming in immune response. Nat. Genet. 50, 424-431 (2018).

        - Calderon, D., Nguyen, M.L.T., Mezger, A. et al. Landscape of stimulation-responsive chromatin across diverse human immune cells. Nat Genet 51, 1494-1505 (2019).

        - Gate, R.E., Cheng, C.S., Aiden, A.P. et al. Genetic determinants of co-accessible chromatin regions in activated T cells across humans. Nat Genet 50, 1140-1150 (2018).

        - Glinos, D.A., Soskic, B., Williams, C. et al. Genomic profiling of T-cell activation suggests increased sensitivity of memory T cells to CD28 costimulation. Genes Immun 21, 390-408 (2020).

        - Gutierrez-Arcelus, M., Baglaenko, Y., Arora, J. et al. Allele-specific expression changes dynamically during T cell activation in HLA and other autoimmune loci. Nat Genet 52, 247-253 (2020).

        - Kim-Hellmuth, S. et al. Genetic regulatory effects modified by immune activation contribute to autoimmune disease associations. Nat. Commun. 8, 266 (2017).

        - Ye, C. J. et al. Intersection of population variation and autoimmunity genetics in human T cell activation. Science 345, 1254665 (2014).

      - As a general point, I appreciate it when each claim includes a corresponding effect size and p-value, which helps me evaluate the strength of significance of supporting evidence. 

      We greatly appreciate the referee’s expert assessment of our work and emphasis on the value of our functional follow-up studies. Our precision-recall analyses were not meant to represent an exhaustive comparison of all prior GWAS gene nomination studies, although we agree that this could (and should) be done as part of a separate study in a future manuscript. Instead, we focused on gene nomination studies that 1) analyzed resting and activated human CD4+ T cells, 2) whose experimental design was most comparable to our own studies, and 3) had raw data readily available in the appropriate formats to allow re-analysis and harmonization before comparison. This is a point we did not make sufficiently clear in the original version of the manuscript, but have clarified in the revision. 

      Based on this rationale, we agree that the studies by Gate et al. and Ye et al. should be included in our comparative precision-recall analysis, and we have done so in the revised manuscript. The Gate study reported ATAC-seq peak co-accessibility, caQTL, eQTL, and HiC data, and we now include the resulting gene nominations from these datasets in the precision-recall analysis. These datasets performed poorly with respect to nomination of HIEI genes, likely due to small sample numbers and low sequencing depth compared to the other eQTL and chromatin capture-based studies. The eQTL reported by Ye et al. nominated 15 genes for autoimmune traits, two of which were in the ‘truth’ HIEI set (IL7R and IL2RB). This resulted in low predictive power but high precision due to the low number of nominated genes compared to the other V2G datasets. As suggested by referee 1, we have also subjected our data to the ‘activity-by-contact’ (ABC) algorithm and have included this dataset in the comparison as well. Please see Figure 5 in the revised manuscript.
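
      To make the precision and recall trade-off described above concrete, here is a minimal sketch using the Ye et al. figures (15 nominated genes, two of them in the truth set). Everything else is assumed for illustration: the size of the HIEI truth set is not stated in this response, so `HYPOTHETICAL_TRUTH_SIZE` is a made-up value, and all gene names other than IL7R and IL2RB are placeholders.

```python
def precision_recall(nominated, truth):
    """Return (precision, recall) of a nominated gene set against a truth set."""
    nominated, truth = set(nominated), set(truth)
    true_positives = len(nominated & truth)
    precision = true_positives / len(nominated) if nominated else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# The two named hits come from the text; the other 13 nominated genes
# are placeholders because they are not listed in the response.
nominated = {"IL7R", "IL2RB"} | {f"PLACEHOLDER_{i}" for i in range(13)}

# ASSUMPTION: truth-set size chosen only to show the effect on recall.
HYPOTHETICAL_TRUTH_SIZE = 100
truth = {"IL7R", "IL2RB"} | {
    f"TRUTH_{i}" for i in range(HYPOTHETICAL_TRUTH_SIZE - 2)
}

precision, recall = precision_recall(nominated, truth)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

      Precision here is fixed at 2/15 by the reported numbers, whereas recall depends entirely on the assumed truth-set size, which is why a short nomination list can score well on one metric and poorly on the other.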

      We have elected not to include data from the other studies suggested by the referee for the following reasons: The stimulation paradigm used in the Glinos study is very different from that used in other studies. Also, this study and the study by Calderon did not nominate genes. The studies by Alasoo et al. and Kim-Hellmuth et al. analyzed macrophages, which are not a comparable cell type to CD4+ T cells. The allele-specific eQTL study by Gutierrez-Arcelus et al. included the relevant cell type and activation states, but included a relatively small number of samples (24) and variants (561), and the raw data in dbGAP does not readily allow for re-analysis and harmonization with the other studies. We thank the reviewer for helpful suggestions that have improved the quality of our study.

      Reviewer #3 (Public Review): 

      Summary:

      This paper used RNAseq, ATACseq, and Hi-C to assess gene expression, chromatin accessibility, and chromatin physical associations for naïve CD4+ T cells as they respond to stimulation through TCR and CD28. With these data in hand, the authors linked 423 GWAS signals to their respective target genes, where most of these were not in the proximal promoter, but rather distal enhancers. The IL-2 gene was used as an example to identify new distal cis-regulatory regions required for optimal IL-2 gene transcription. These distal elements interact with the proximal IL2 promoter region. When the distal enhancer contained an autoimmune SNP, it affected IL-2 gene transcription. The authors also identified genetic risk variants that were associated with genes upon activation. Some of these regulate proliferation and cytokine production, but others are novel.

      Strengths:

      This paper provides a wealth of data related to gene expression after CD4 T cells are activated through the TCR and CD28. An important strength of this paper is that these data were intensively analyzed to uncover autoimmune disease SNPs in cis-acting regions. Many of these could be assigned to likely target genes even though they often are in distal enhancers. These findings help to provide a better understanding concerning the mechanism by which GWAS risk elements impact gene expression. 

      Another strength of this study was the proof-of-principle studies examining the IL-2 gene. Not only were new cis-acting enhancers discovered, but they were functionally shown to be important in regulating IL-2 expression, including susceptibility to colitis. Their importance was also established with respect to such distal enhancers harboring disease-relevant SNPs, which were shown to affect IL-2 transcription. 

      The data from this study were also mined against past CRISPR screens that identified genes that control aspects of CD4 T cell activation. From these comparisons, novel genes were identified that function during T cell activation. 

      Weaknesses:

      A weakness of this study is that few individuals were analyzed, i.e., RNAseq and ATACseq (n=3) and HiC (n=2). Thus, the authors may have underestimated potentially relevant risk associations by their chromatin capture-based methodology. This might account for the low overlap of their data with the eQTL-based approach or the HIEI truth set. 

      Impact:

      This study indicates that defining distal chromatin interacting regions helps to identify distal genetic elements, including relevant variants, that contribute to gene activation. 

      We greatly appreciate the referee’s expert assessment of our work and emphasis on the value of our functional follow-up studies. We have ensured that all sample sizes, effect sizes, p values and FDR statistics are included in the figures and figure legends. We agree that including more donors for the HiC studies would increase the number of implicated variants and genes; however, all the chromatin-based V2G approaches described in our manuscript use relatively small sample sizes, but implicate more variants and genes than the comparable eQTL studies. That is, the low overlap is not driven by a paucity of GWAS-chromatin-based associations. An alternative explanation for the low overlap between GWAS-chromatin-based approaches and eQTL approaches was recently proposed by Pritchard and colleagues, who reported that GWAS and eQTL studies systematically implicate different types of variants (Mostafavi et al., Nature Genetics 2023). Among other differences, eQTL tend to implicate nearby genes while GWAS variants implicate distant genes, and our results support this contention. We referred to this study in the original version of the manuscript, but have included a more extensive discussion of potential explanations in the revised version. We thank the reviewer for helpful suggestions that have improved the quality of our study.

    1. eLife assessment

      This is a useful manuscript describing the competitive binding between Parkin domains to define the importance of dimerization in the mechanism of Parkin regulation and catalytic activity. The evidence supporting the importance of Parkin dimerization for an 'in trans' model of Parkin activity described in this manuscript is solid, but lacks more stringent biochemical characterization of competitive binding that could provide more direct evidence to support the authors' conclusions. This work will be of interest to those focused on defining the molecular mechanisms involved in ubiquitin ligase interactions, PINK-Parkin-mediated mitophagy, and mitochondrial organellar quality control.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors used structural and biophysical methods to provide insight into Parkin regulation. The breadth of data supporting their findings was impressive and generally well-orchestrated.

      Strengths:

      (1) They have done a better job explaining the rationale for their experiments throughout.

      (2) The use of molecular scissors in their construct represents a creative approach to examine inter-domain interactions. Appropriate controls were included.

      (3) From my assessment, the experiments are well-conceived and executed.

      (4) The authors do a better job of highlighting the question being addressed experimentally.

    3. Reviewer #2 (Public Review):

      In the revised manuscript, the authors tried to address some of my comments from the previous round of review. Notably, they have performed some additional ITC experiments where protein precipitation is not an issue to probe interactions between PARKIN and different domains. In addition, they have toned down some of the language in the text to better reflect their data and results. However, I still feel that the manuscript lacks some key answers regarding the relative interactions between p-PARKIN and different domains, as discussed in my previous review. A deeper dive into the underlying biophysical and biochemical features that drive these interactions is important to fully understand the importance of their work. However, this manuscript does provide some interesting potential insights into the mechanisms of PARKIN activation that could be useful for the field moving forward.

    4. Reviewer #3 (Public Review):

      Summary:

      In their manuscript, Lenka et al. present data that could suggest an "in trans" model of Parkin ubiquitination activity. Parkin is an intensely studied E3 ligase implicated in mitophagy, whereby missense mutations to the PARK2 gene are known to cause autosomal recessive juvenile parkinsonism. From a mechanistic point of view, Parkin is extremely complex. Its activity is tightly controlled by several modes of auto-inhibition that must be released by cues of mitochondrial damage. While the general overview of Parkin activation has been mapped out in recent years, several details have remained murky. In particular, whether Parkin dimerizes as part of its feed-forward signaling mechanism, and whether said dimerization can facilitate ligase activation, has remained unclear. Here, Lenka et al. use various truncation mutants of Parkin in an attempt to understand the likelihood of dimerization (in support of an "in trans" model for catalysis).

      Strengths:

      The results are bolstered by several distinct approaches including analytical SEC with cleavable Parkin constructs, ITC interaction studies, ubiquitination assays, protein crystallography, and cellular localization studies.

      Weaknesses:

      As presented, however, the storyline is very confusing to follow and several lines of experimentation felt like distractions from the primary message. Furthermore, many experiments could only indirectly support the authors' conclusions, and therefore the final picture of what new features can be firmly added to the model of Parkin activation and function is unclear.

      Following peer review and revision, the claims are still not fully supported by direct evidence. While the experimental system may be necessary and/or convenient given the unique challenges in studying Parkin, it does not directly speak toward the conclusions that the authors make, nor does it provide an accurate representation of biology.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The authors used structural and biophysical methods to provide insight into Parkin regulation. The breadth of data supporting their findings was impressive and generally well-orchestrated. Still, the impact of their results builds on recent structural studies and the stated impact is based on these prior works.

      Strengths:

      (1) After reading through the paper, the major findings are:

      - RING2 and pUbl compete for binding to RING0.

      - Parkin can dimerize.

      - ACT plays an important role in enzyme kinetics.

      (2) The use of molecular scissors in their construct represents a creative approach to examining inter-domain interactions.

      (3) From my assessment, the experiments are well-conceived and executed.

      We thank the reviewer for their positive remark and extremely helpful suggestions.

      Weaknesses:

      The manuscript, as written, is NOT for a general audience. Admittedly, I am not an expert on Parkin structure and function, but I had to do a lot of homework to try to understand the underlying rationale and impact. This reflects, I think, that the work generally represents an incremental advance on recent structural findings.

      To this point, it is hard to understand the impact of this work without more information highlighting the novelty. There are several structures of Parkin in various auto-inhibited states, and it was hard to delineate how this is different.

      For the sake of the general audience, we have included all the details of Parkin structures and conformations seen (Extended Fig. 1). The structures in the present study are to validate the biophysical/biochemical experiments, highlighting key findings. For example, we solved the phospho-Parkin (complex with pUb) structure after treatment with 3C protease (Fig. 2C), which washes off the pUbl-linker, as shown in Fig 2B. The structure of the pUbl-linker depleted phospho-Parkin-pUb complex showed that RING2 returned to the closed state (Fig. 2C), which is confirmation of the SEC assay in Fig. 2B. Similarly, the structure of the pUbl-linker depleted phospho-Parkin R163D/K211N-pUb complex (Fig. 3C), was done to validate the SEC data showing displacement of pUbl-linker is independent of pUbl interaction with the basic patch on RING0 (Fig. 3B). In addition, the latter structure also revealed a new donor ubiquitin binding pocket in the linker (connecting REP and RING2) region of Parkin (Fig. 9). Similarly, trans-complex structure of phospho-Parkin (Fig. 4D) was done to validate the biophysical data (Fig. 4A-C, Fig. 5A-D) showing trans-complex between phospho-Parkin and native Parkin. The latter also confirmed that the trans-complex was mediated by interactions between pUbl and the basic patch on RING0 (Fig. 4D). Furthermore, we noticed that the ACT region was disordered in the trans-complex between phospho-Parkin (1-140 + 141-382 + pUb) (Fig. 8A) which had ACT from the trans molecule, indicating ACT might be present in the cis molecule. The latter was validated from the structure of trans-complex between phospho-Parkin with cis ACT (1-76 + 77-382 + pUb) (Fig. 8C), showing the ordered ACT region. The structural finding was further validated by biochemical assays (Fig. 8 D-F, Extended Data Fig. 9C-E).

      The structure of TEV-treated R0RBR (TEV) (Extended Data Fig. 4C) was done to ensure that the inclusion of TEV and treatment with TEV protease did not perturb Parkin folding, an important control for our biophysical experiments.

      As noted, I appreciated the use of protease sites in the fusion protein construct. It is unclear how the loop region might affect the protein structure and function. The authors worked to demonstrate that this did not introduce artifacts, but the biological context is missing.

      We thank the reviewer for appreciating the use of protease sites in the fusion protein construct. Protease sites were used to overcome the competing mode of binding that makes interactions very transient and beyond the detection limit of methods such as ITC or SEC. While these interactions are quite transient in nature, they could still be useful for the activation of various Parkin isoforms that lack either the Ubl domain or RING2 domain (Extended Data Fig. 6, Fig. 10). Our Parkin localization assays also suggest an important role of these interactions in the recruitment of Parkin molecules to the damaged mitochondria (Fig. 6).

      While it is likely that the binding is competitive between the Ubl and RING2 domains, the data is not quantitative. Is it known whether the folding of the distinct domains is independent? Or are there interactions that alter folding? It seems plausible that conformational rearrangements may invoke an orientation of domains that would be incompatible. The biological context for the importance of this interaction was not clear to me.

      This is a great point. In the revised manuscript, we have included quantitative data between phospho-Parkin and untethered ∆Ubl-Parkin (TEV) (Fig. 5B) showing similar interactions using phospho-Parkin K211N and untethered ∆Ubl-Parkin (TEV) (Fig. 4B). Folding of the Ubl domain or various combinations of RING domains lacking Ubl seems okay. Also, folding of the RING2 domain on its own appears to be fine. However, human Parkin lacking the RING2 domain seems to have some folding issues, mainly due to exposure of the hydrophobic pocket on RING0, as also suggested by previous efforts (Gladkova et al., ref. 24; Sauve et al., ref. 29). The latter could be overcome by co-expression of the Parkin construct lacking RING2 with PINK1 (Sauve et al., ref. 29), as phospho-Ubl binds to the same hydrophobic pocket on RING0 where RING2 binds. A drastic reduction in the melting temperature of phospho-Parkin (Gladkova et al., ref. 24), very likely due to exposure of the hydrophobic surface between RING0 and RING2, correlates with the folding issues of human Parkin constructs with RING0 exposed.

      From the biological context, the competing nature between phospho-Ubl and RING2 domains could block the non-specific interaction of phosphorylated-ubiquitin-like proteins (phospho-Ub or phospho-NEDD8) with RING0 (Lenka et al. ref. 33), during Parkin activation. 

      (5) What is the rationale for mutating Lys211 to Asn? Were other mutations tried? Glu? Ala? Just missing the rationale. I think this may have been identified previously in the field, but not clear what this mutation represents biologically.

      Lys211Asn is a Parkinson’s disease mutation; therefore, we decided to use the same mutation for biophysical studies.  

      I was confused about how the phospho-proteins were generated. After looking through the methods, there appear to be phosphorylation experiments, but it is unclear what the efficiency was for each protein (i.e. what % gets modified). In the text, the authors refer to phospho-Parkin (T270R, C431A), but not clear how these mutations might influence this process. I gather that these are catalytically inactive, but it is unclear to me how this is catalyzing the ubiquitination in the assay.

      This is an excellent question. Because different phosphorylation statuses would affect the analysis, we ensured complete phosphorylation status using Phos-Tag SDS-PAGE, as shown below.

      Author response image 1.

      Our biophysical experiments in Fig. 5C show that trans complex formation is mediated by interactions between the basic patch (comprising K161, R163, K211) on RING0 and the phospho-Ubl domain in trans. These interactions result in the displacement of RING2 (Fig. 5C). Parkin activation is mediated by displacement of RING2 and exposure of catalytic C431 on RING2. While phospho-Parkin T270R/C431A is catalytically dead, the phospho-Ubl domain of phospho-Parkin T270R/C431A would bind to the basic patch on RING0 of WT-Parkin, resulting in activation of WT-Parkin as shown in Fig. 5E. A schematic figure is shown below to explain the same.

      Author response image 2.

      (7) The authors note that "ACT can be complemented in trans; however, it is more efficient in cis", but it is unclear whether both would be important or if the favored interaction is dominant in a biological context.

      This is an excellent question about the biological context of ACT, and it needs further exploration. Because of its flexible nature, ACT can be complemented both in cis and in trans. We can only speculate that cis interactions between ACT and RING0 are more relevant in the biological context: during protein synthesis and folding, ACT would be translated before RING2 and would thus occupy the small hydrophobic patch on RING0 in cis. Unpublished data show that Biogen compounds can replace the ACT region to activate Parkin (https://doi.org/10.21203/rs.3.rs-4119143/v1). This finding further suggests flexibility in this region.

      (8) The authors repeatedly note that this study could aid in the development of small-molecule regulators against Parkin to treat PD, but this is a long way off. And it is not clear from their manuscript how this would be achieved. As stated, this is conjecture.

      As suggested by this reviewer, we have removed this point in the revised manuscript.

      Reviewer #2 (Public Review):

      This manuscript uses biochemistry and X-ray crystallography to further probe the molecular mechanism of Parkin regulation and activation. Using a construct that incorporates cleavage sites between different Parkin domains to increase the local concentration of specific domains (i.e., molecular scissors), the authors suggest that competitive binding between the p-Ubl and RING2 domains for the RING0 domain regulates Parkin activity. Further, they demonstrate that this competition can occur in trans, with a p-Ubl domain of one Parkin molecule binding the RING0 domain of a second monomer, thus activating the catalytic RING1 domain. In addition, they suggest that the ACT domain can similarly bind and activate Parkin in trans, albeit at a lower efficiency than that observed for p-Ubl. The authors also suggest from crystal structure analysis and some biochemical experiments that the linker region between RING2 and repressor elements interacts with the donor ubiquitin to enhance Parkin activity.

      Ultimately this manuscript challenges previous work suggesting that the p-Ubl domain does not bind to the Parkin core in the mechanism of Parkin activation. The use of the 'molecular scissors' approach is an interesting way to probe this type of competitive binding. However, there are issues with the experimental approach that detract from the overall quality and potential impact of the work.

      We thank the reviewer for their positive remark and constructive suggestions.

      The competitive binding between p-Ubl and RING2 domains for the Parkin core could have been better defined using biophysical and biochemical approaches that explicitly define the relative affinities that dictate these interactions. A better understanding of these affinities could provide more insight into the relative bindings of these domains, especially as it relates to the in trans interactions.

      This is an excellent point regarding the relative affinities of pUbl and RING2 for the Parkin core (lacking Ubl and RING2). While we could purify p-Ubl, we failed to purify human Parkin (lacking RING2 and phospho-Ubl). These folding issues were likely due to the exposure of a highly hydrophobic surface on RING0 (as shown below) in the absence of pUbl and RING2 in the R0RB construct. Also, RING2 with an exposed hydrophobic surface would be prone to folding issues, which makes it unsuitable for affinity measurements. A drastic reduction in the melting temperature of phospho-Parkin (Gladkova et al., ref. 24) also highlights the importance of the hydrophobic surface between RING0 and RING2 for Parkin folding/stability. A separate study would be required to try these Parkin constructs from different species and ensure proper folding before using them for affinity measurements.

      Author response image 3.

      I also have concerns about the results of using molecular scissors to 'increase local concentrations' and allow for binding to be observed. These experiments are done primarily using proteolytic cleavage of different domains followed by size exclusion chromatography. ITC experiments suggest that the binding constants for these interactions are in the µM range, although these experiments are problematic as the authors indicate in the text that protein precipitation was observed during these experiments. This type of binding could easily be measured in other assays. My issue relates to the ability of a protein complex (comprising the core and cleaved domains) with a Kd of 1 µM to be maintained in an SEC experiment. The off-rates for these complexes must be exceedingly slow, which doesn't really correspond to the low µM binding constants discussed in the text. How do the authors explain this? What is driving the Koff to levels sufficiently slow to prevent dissociation by SEC? Considering that the authors are challenging previous work describing the lack of binding between the p-Ubl domain and the core, these issues should be better resolved in this current manuscript. Further, it's important to have a more detailed understanding of relative affinities when considering the functional implications of this competition in the context of full-length Parkin. Similar comments could be made about the ACT experiments described in the text.

      This is a great point. In the revised manuscript, we repeated ITC measurements in a different buffer system, which gave nice ITC data. In the revised manuscript, we have also performed ITC measurements using native phospho-Parkin. Phospho-Parkin and untethered ∆Ubl-Parkin (TEV) (Fig. 5B) show similar affinities as seen between phospho-Parkin K211N and untethered ∆Ubl-Parkin (TEV) (Fig. 4B). However, Kd values were consistent in the range of 1.0 ± 0.4 µM which could not address the reviewer’s point regarding slow off-rate. The crystal structure of the trans-complex of phospho-Parkin shows several hydrophobic and ionic interactions between p-Ubl and Parkin core, suggesting a strong interaction and, thus, justifying the co-elution on SEC. Additionally, ITC measurements between E2-Ub and P-Parkin-pUb show similar affinity (Kd = 0.9 ± 0.2 µM) (Kumar et al., 2015, EMBO J.), and yet they co-elute on SEC (Kumar et al., 2015, EMBO J.).
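      The off-rate question raised above can be made concrete with a back-of-the-envelope calculation. For a simple 1:1 interaction, Kd = koff / kon, so koff = Kd × kon and the half-life of the complex is ln(2) / koff. The minimal sketch below uses the Kd of about 1 µM quoted in this exchange; the association rate constants are illustrative assumptions spanning a range often considered for protein-protein binding, not values measured for Parkin, and the function names are hypothetical.

```python
import math


def off_rate(kd_molar: float, kon_per_molar_per_s: float) -> float:
    """Dissociation rate constant (s^-1) for a 1:1 complex: koff = Kd * kon."""
    return kd_molar * kon_per_molar_per_s


def half_life_s(koff_per_s: float) -> float:
    """Half-life (s) of the complex under first-order dissociation."""
    return math.log(2) / koff_per_s


if __name__ == "__main__":
    kd = 1.0e-6  # M; the ~1 µM Kd quoted in the review exchange

    # ASSUMPTION: illustrative kon values (M^-1 s^-1), not measured for Parkin.
    for kon in (1.0e4, 1.0e5, 1.0e6):
        koff = off_rate(kd, kon)
        print(f"kon={kon:.0e} M^-1 s^-1  koff={koff:.3g} s^-1  "
              f"t1/2={half_life_s(koff):.3g} s")
```

      Under these assumed on-rates the estimated half-lives fall in the range of seconds to about a minute, which is short compared with a typical SEC run. This is the quantitative core of the reviewer's concern; rebinding on the column, avidity, or a slower-than-assumed on-rate could all change the picture.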

      Ultimately, this work does suggest additional insights into the mechanism of Parkin activation that could contribute to the field. There is a lot of information included in this manuscript, giving it breadth, albeit at the cost of depth for the study of specific interactions. Further, I felt that the authors oversold some of their data in the text, and I'd recommend being a bit more careful when claiming an experiment 'confirms' a specific model. In many cases, there are other models that could explain similar results. For example, in Figure 1C, the authors state that their crystal structure 'confirms' that "RING2 is transiently displaced from the RING0 domain and returns to its original position after washing off the p-Ubl linker". However, it isn't clear to me that RING2 ever dissociated when prepared this way. While there are issues with the work that I feel should be further addressed with additional experiments, there are interesting mechanistic details suggested by this work that could improve our understanding of Parkin activation. However, the full impact of this work won't be fully appreciated until there is a more thorough understanding of the regulation and competitive binding between p-Ubl and RING2 to R0RB both in cis and in trans.

      We thank the reviewer for their positive comment. In the revised manuscript, we have included the reviewer’s suggestion. The conformational changes in phospho-Parkin were established from the SEC assay (Fig. 2A and Fig. 2B), which show displacement/association of phospho-Ubl or RING2 after treatment of phospho-Parkin with 3C and TEV, respectively. For crystallization, we first phosphorylated Parkin, where RING2 is displaced due to phospho-Ubl (as shown in SEC), followed by treatment with 3C protease, which led to pUbl wash-off. The Parkin core separated from phospho-Ubl on SEC was used for crystallization and structure determination in Fig. 2C, where RING2 returned to the RING0 pocket, which confirms SEC data (Fig. 2B).

      Reviewer #3 (Public Review):

      Summary:

      In their manuscript "Additional feedforward mechanism of Parkin activation via binding of phospho-UBL and RING0 in trans", Lenka et al. present data that could suggest an "in trans" model of Parkin ubiquitination activity. Parkin is an intensely studied E3 ligase implicated in mitophagy, whereby missense mutations to the PARK2 gene are known to cause autosomal recessive juvenile parkinsonism. From a mechanistic point of view, Parkin is extremely complex. Its activity is tightly controlled by several modes of auto-inhibition that must be released by cues of mitochondrial damage. While the general overview of Parkin activation has been mapped out in recent years, several details have remained murky. In particular, whether Parkin dimerizes as part of its feed-forward signaling mechanism, and whether said dimerization can facilitate ligase activation, has remained unclear. Here, Lenka et al. use various truncation mutants of Parkin in an attempt to understand the likelihood of dimerization (in support of an "in trans" model for catalysis).

      Strengths:

      The results are bolstered by several distinct approaches including analytical SEC with cleavable Parkin constructs, ITC interaction studies, ubiquitination assays, protein crystallography, and cellular localization studies.

      We thank the reviewer for their positive remark.

      Weaknesses:

      As presented, however, the storyline is very confusing to follow and several lines of experimentation felt like distractions from the primary message. Furthermore, many experiments could only indirectly support the authors' conclusions, and therefore the final picture of what new features can be firmly added to the model of Parkin activation and function is unclear.

      We thank the reviewer for their constructive criticism, which has helped us to improve the quality of this manuscript.

      Major concerns:

      (1) This manuscript solves numerous crystal structures of various Parkin components to help support their idea of in trans transfer. The way these structures are presented more resemble models and it is unclear from the figures that these are new complexes solved in this work, and what new insights can be gleaned from them.

      The structures in the present study are to validate the biophysical/biochemical experiments highlighting key findings. For example, we solved the phospho-Parkin (complex with pUb) structure after treatment with 3C protease (Fig. 2C), which washes off the pUbl-linker, as shown in Fig. 2B. The structure of pUbl-linker depleted phospho-Parkin-pUb complex showed that RING2 returned to the closed state (Fig. 2C), which is confirmation of the SEC assay in Fig. 2B. Similarly, the structure of the pUbl-linker depleted phospho-Parkin R163D/K211N-pUb complex (Fig. 3C), was done to validate the SEC data showing displacement of pUbl-linker is independent of pUbl interaction with the basic patch on RING0 (Fig. 3B). In addition, the latter structure also revealed a new donor ubiquitin binding pocket in the linker (connecting REP and RING2) region of Parkin (Fig. 9). Similarly, trans-complex structure of phospho-Parkin (Fig. 4D) was done to validate the biophysical data (Fig. 4A-C, Fig. 5A-D) showing trans-complex between phospho-Parkin and native Parkin. The latter also confirmed that the trans-complex was mediated by interactions between pUbl and the basic patch on RING0 (Fig. 4D). Furthermore, we noticed that the ACT region was disordered in the trans-complex between phospho-Parkin (1-140 + 141-382 + pUb) (Fig. 8A) which had ACT from the trans molecule, indicating ACT might be present in the cis molecule. The latter was validated from the structure of trans-complex between phospho-Parkin with cis ACT (1-76 + 77-382 + pUb) (Fig. 8C), showing the ordered ACT region. The structural finding was further validated by biochemical assays (Fig. 8 D-F, Extended Data Fig. 9C-E).

      The structure of TEV-treated R0RBR (TEV) (Extended Data Fig. 4C) was done to ensure that the inclusion of TEV and treatment with TEV protease did not perturb Parkin folding, an important control for our biophysical experiments.

      (2) There are no experiments that definitively show the in trans activation of Parkin. The binding experiments and size exclusion chromatography are a good start, but the way these experiments are performed, they'd be better suited as support for a stronger experiment showing Parkin dimerization. In addition, the rationale for an in trans activation model is not convincingly explained until the concept of Parkin isoforms is introduced in the Discussion. The authors should consider expanding this concept into other parts of the manuscript.

      We thank the reviewer for appreciating the Parkin dimerization. Our biophysical data in Fig. 5C show that Parkin dimerization is mediated by interactions between phospho-Ubl and RING0 in trans, leading to the displacement of RING2. However, the Parkin K211N mutation (on RING0) perturbs interaction with phospho-Parkin and leads to loss of Parkin dimerization and loss of RING2 displacement (Fig. 5C). The interaction between pUbl and the K211 pocket on RING0 leads to the displacement of RING2, resulting in Parkin activation as catalytic residue C431 on RING2 is exposed for catalysis. The biophysical experiment is further confirmed by a biochemical experiment where the addition of catalytically inactive phospho-Parkin T270R/C431A activates autoinhibited WT-Parkin in trans using the mechanism as discussed (a schematic representation is also shown in Author response image 2).

      We thank this reviewer regarding Parkin isoforms. In the revised manuscript, we have included Parkin isoforms in the results section, too.

      (2a) For the in trans activation experiment using wt Parkin and pParkin (T270R/C431A) (Figure 3D), there needs to be a large excess of pParkin to stimulate the catalytic activity of wt Parkin. This experiment has low cellular relevance as these point mutations are unlikely to occur together to create this nonfunctional pParkin protein. In the case of pParkin activating wt Parkin (regardless of artificial point mutations inserted to study specifically the in trans activation), if there needs to be much more pParkin around to fully activate wt Parkin, isn't it just more likely that the pParkin would activate in cis?

      To test phospho-Parkin as an activator of Parkin in trans, we wanted to use the catalytically inactive version of phospho-Parkin to avoid the background activity of p-Parkin. While it is true that a large excess of pParkin (T270R/C431A) is required to activate WT-Parkin in the in vitro set-up, it is not very surprising as in WT-Parkin, the unphosphorylated Ubl domain would block the E2 binding site on RING1. Also, due to interactions between pParkin (T270R/C431A) molecules, the net concentration of pParkin (T270R/C431A) as an activator would be much lower. However, the Ubl blocking E2 binding site on RING1 won’t be an issue between phospho-Parkin molecules or between Parkin isoforms (lacking Ubl domain or RING2).

      (2ai) Another underlying issue with this experiment is that the authors do not consider the possibility that the increased activity observed is a result of increased "substrate" for auto-ubiquitination, as opposed to any role in catalytic activation. Have the authors considered looking at Miro as a substrate in order to control for this?

      This is quite an interesting point. However, this will be only possible if Parkin is ubiquitinated in trans, as auto-ubiquitination is possible with active Parkin and not with catalytically dead (phospho-Parkin T270R, C431A) or autoinhibited (WT-Parkin). Also, in the previous version of the manuscript, where we used only phospho-Ubl as an activator of Parkin in trans, we tested Miro1 ubiquitination and auto-ubiquitination, and the results were the same (Author response image 4).

      Author response image 4.

      (2b) The authors mention a "higher net concentration" of the "fused domains" with RING0, and use this to justify artificially cleaving the Ubl or RING2 domains from the Parkin core. This fact should be moot. In cells, it is expected there will only be a 1:1 ratio of the Parkin core with the Ubl or RING2 domains. To date, there is no evidence suggesting multiple pUbls or multiple RING2s can bind the RING0 binding site. In fact, the authors here even show that either the RING2 or pUbl needs to be displaced to permit the binding of the other domain. That being said, there would be no "higher net concentration" because there would always be the same molar equivalents of Ubl, RING2, and the Parkin core.

      We apologize for the confusion. “Higher net concentration” is with respect to fused domains versus the domain provided in trans. Due to the competing nature of the interactions between pUbl/RING2 and RING0, the interactions are too transient and beyond the detection limit of the biophysical techniques. While the domains are fused (for example, RING0-RING2 in the same polypeptide) in a polypeptide, their effective concentrations are much higher than those (for example, pUbl) provided in trans; thus, biophysical methods fail to detect the interaction. Treatment with protease solves the above issue due to the higher net concentration of the fused domain, and trans interactions can be measured using biophysical techniques. However, the nature of these interactions and conformational changes is very transient, which is also suggested by the data. Therefore, Parkin molecules will never remain associated; rather, Parkin will transiently interact and activate Parkin molecules in trans.

      (2c) A larger issue remaining in terms of Parkin activation is the lack of clarity surrounding the role of the linker (77-140); particularly whether its primary role is to tether the Ubl to the cis Parkin molecule versus a role in permitting distal interactions to a trans molecule. The way the authors have conducted the experiments presented in Figure 2 limits the possible interactions that the activated pUbl could have by (a) ablating the binding site in the cis molecule with the K211N mutation; (b) further blocking the binding site in the cis molecule by keeping the RING2 domain intact. These restrictions to the cis parkin molecule effectively force the pUbl to bind in trans. A competition experiment to demonstrate the likelihood of cis or trans activation in direct comparison with each other would provide stronger evidence for trans activation.

      This is an excellent point. In the revised manuscript, we have performed experiments using native phospho-Parkin (Revised Figure 5), and the results are consistent with those in Figure 2 (Revised Figure 4), where we used the K211N mutation.

      (3) A major limitation of this study is that the authors interpret structural flexibility from experiments that do not report directly on flexibility. The analytical SEC experiments report on binding affinity and more specifically off-rates. By removing the interdomain linkages, the accompanying on-rate would be drastically impacted, and thus the observations are disconnected from a native scenario. Likewise, observations from protein crystallography can be consistent with flexibility, but certainly should not be directly interpreted in this manner. Rigorous determination of linker and/or domain flexibility would require alternative methods that measure this directly.

      We agree with the reviewer that these methods do not directly capture structural flexibility, and that rigorous determination of linker flexibility would require alternative methods that measure it directly. However, due to the complex nature of the interactions and technical limitations, breaking the interdomain linkages was the best possible way to capture interactions in trans. Interestingly, all previous methods that report cis interactions between pUbl and RING0 also used a similar approach (Gladkova et al., ref. 24; Sauve et al., ref. 29).

      (4) The analysis of the ACT element comes across as incomplete. The authors make a point of a competing interaction with Lys48 of the Ubl domain, but the significance of this is unclear. It is possible that this observation could be an overinterpretation of the crystal structures. Additionally, the rationale for why the ACT element should or shouldn't contribute to in trans activation of different Parkin constructs is not clear. Lastly, the conclusion that this work explains the evolutionary nature of this element in chordates is highly overstated.

      We agree with the reviewer that the significance of Lys48 is unclear. We have presented this just as one of the observations from the crystal structure. As the reviewer suggested, we have removed the sentence about the evolutionary nature of this element from the revised manuscript.

      (5) The analysis of the REP linker element also seems incomplete. The authors identify contacts to a neighboring pUb molecule in their crystal structure, but the connection between this interface (which could be a crystallization artifact) and their biochemical activity data is not straightforward. The analysis of flexibility within this region using crystallographic and AlphaFold modeling observations is very indirect. The authors also draw parallels with linker regions in other RBR ligases that are involved in recognizing the E2-loaded Ub. Firstly, it is not clear from the text or figures whether the "conserved" hydrophobic within the linker region is involved in these alternative Ub interfaces. And secondly, the authors appear to jump to the conclusion that the Parkin linker region also binds an E2-loaded Ub, even though their original observation from the crystal structure seems inconsistent with this. The entire analysis feels very preliminary and also comes across as tangential to the primary storyline of in trans Parkin activation.

      We agree with the reviewer that crystal structure data and biochemical data are not directly linked. In the revised manuscript, we have also highlighted the conserved hydrophobic in the linker region at the ubiquitin interface (Fig. 9C and Extended Data Fig. 11A), which was somehow missed in the original manuscript. We want to add that a very similar analysis and supporting experiments identified donor ubiquitin-binding sites on the IBR and helix connecting RING1-IBR (Kumar et al., Nature Str. and Mol. Biol., 2017), which several other groups later confirmed. In the mentioned study, the Ubl domain of Parkin from the symmetry mate Parkin molecule was identified as a mimic of “donor ubiquitin” on IBR and helix connecting RING1-IBR.

      In the present study, a neighboring pUb molecule in the crystal structure is identified as a donor ubiquitin mimic (Fig. 9C) by supporting biophysical/biochemical experiments. First, we show that mutation of I411A in the REP linker of Parkin perturbs Parkin interaction with E2~Ub (donor) (Fig. 9F). Another supporting experiment was performed using a Ubiquitin-VS probe assay, which is independent of E2. Assays using Ubiquitin-VS show that I411A mutation in the REP-RING2 linker perturbs Parkin charging with Ubiquitin-VS (Extended Data Fig. 11 B). Furthermore, the biophysical data showing loss of Parkin interaction with donor ubiquitin is further supported by ubiquitination assays. Mutations in the REP-RING2 linker perturb the Parkin activity (Fig. 9E), confirming biophysical data. This is further confirmed by mutations (L71A or L73A) on ubiquitin (Extended Data Fig. 11C), resulting in loss of Parkin activity. The above experiments nicely establish the role of the REP-RING2 linker in interaction with donor ubiquitin, which is consistent with other RBRs (Extended Data Fig. 11A).

      While we agree with the reviewer that this appears tangential to the primary storyline of in trans Parkin activation, we decided to include this data because it could be of interest to the field.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) For clarity, a schematic of the domain architecture of Parkin would be helpful at the outset in the main figures. This will help with the introduction to better understand the protein organization. This is lost in the Extended Figure in my opinion.

      We thank the reviewer for suggesting this, which we have included in Figure 1 of the revised manuscript.

      (2) Related to the competition between the Ubl and RING2 domains, can competition be shown through another method? SPR, ITC, etc? ITC was used in other experiments, but only in the context of mutations (Lys211Asn)? Can this be done with WT sequence?

      This is an excellent suggestion. In the revised Figure 5, we have performed an ITC experiment using WT Parkin, and the results are consistent with what we observed using Lys211Asn Parkin.

      (3) The authors also note that "the AlphaFold model shows a helical structure in the linker region of Parkin (Extended Data Figure 10C), further confirming the flexible nature of this region"... but the secondary structure would not be inherently flexible. This is confusing.

      The flexibility is in terms of the conformation of this linker region observed under the open or closed state of Parkin. In the revised manuscript, we have explained this point more clearly.

      (4) The manuscript needs extensive revision to improve its readability. Minor grammatical mistakes were prevalent throughout.

      We thank the reviewer for pointing this out, and we have corrected these mistakes in the revised manuscript.

      (5) The confocal images are nice, but inset panels may help highlight the regions of interest (ROIs).

      This is corrected in the revised manuscript.

      (6) Trans is misspelled ("tans") towards the end of the second paragraph on page 16.

      This is corrected in the revised manuscript.

      (7) The schematics are helpful, but some of the lettering in Figure 2 is very small.

      This is corrected in the revised manuscript.

      Reviewer #3 (Recommendations For The Authors):

      (1) A significant portion of the results section refers to the supplement, making the overall readability very difficult.

      We accept this criticism: a lot of relevant data could not be added to the main figures and thus ended up in the supplement. In the revised manuscript, we have moved some of the supplementary figures to the main figures.

      (2) Interpretation of the experiments utilizing many different Parkin constructs and cleavage scenarios (particularly the SEC and crystallography experiments) is extremely difficult. The work would benefit from a layout of the Parkin model system, highlighting cleavage sites, key domain terminology, and mutations used in the study, presented together and early on in the manuscript. Using this to identify a simpler system of referencing Parkin constructs would also be a large improvement.

      This is a great suggestion. We have included these points in the revised manuscript, which has improved the readability.

      (3) Lines 81-83; the authors say they "demonstrate the conformational changes in Parkin during the activation process", but fail to show any actual conformational changes. Further, much of what is demonstrated in this work (in terms of crystal structures) corroborates existing literature. The authors should use caution not to overstate their original conclusions in light of the large body of work in this area.

      We thank the reviewer for pointing this out. We have corrected the above statement in the revised manuscript to indicate that we meant it in the context of trans conformational changes.

      (4) Line 446 and 434; there is a discrepancy about which amino acid is present at residue 409. Is this a K408 typo? The authors also present mutational work on K416, but this residue is not shown in the structure panel.

      We thank the reviewer for pointing this out. In the revised manuscript, we have corrected these typos.

    1. eLife assessment

      This study presents important findings on the different polymorphs of alpha-synuclein filaments that form at various pH's during in vitro assembly reactions with purified recombinant protein. Of particular note is the discovery of two new polymorphs (1M and 5A) that form in PBS buffer at pH 7. The strength of the evidence presented is convincing. The work will be of interest to biochemists and biophysicists working on protein aggregation and amyloids.

    2. Reviewer #2 (Public Review):

      Summary:

      This is an exciting paper that explores the in vitro assembly of recombinant alpha-synuclein into amyloid filaments. The authors changed the pH and the composition of the assembly buffers, as well as the presence of different types of seeds, and analysed the resulting structures by cryo-EM.

      Strengths:

      By doing experiments at different pHs, the authors found that so-called type 2 and type-3 polymorphs form in a pH dependent manner. In addition, they find that type-1 filaments form in the presence of phosphate ions. One of their in vitro assembled type-1 polymorphs is similar to the alpha-synuclein filaments that were extracted from the brain of an individual with juvenile-onset synucleinopathy (JOS). They hypothesize that additional densities in a similar place as additional densities in the JOS fold correspond to phosphate ions.

      Comments on the revised version:

      This is OK now. I thank the authors for their constructive engagement with my comments.

    3. Reviewer #3 (Public Review):

      Summary

      The highly heterogeneous nature of α-synuclein (α-syn) fibrils has posed significant challenges for structural reconstruction of the ex vivo conformation. A deeper understanding of the factors influencing the formation of various α-syn polymorphs remains elusive. The manuscript by Frey et al. provides a comprehensive exploration of how pH variations (ranging from 5.8 to 7.4) affect the selection of α-syn polymorphs (specifically, Type 1, 2 and 3) in vitro by using cryo-electron microscopy (cryo-EM) and helical reconstruction techniques. Crucially, the authors identify two novel polymorphs at pH 7.0 in PBS. These polymorphs bear resemblance to the structure of the patient-derived juvenile-onset synucleinopathy (JOS) polymorph and diseased tissue amplified α-syn fibrils. The revised manuscript more strongly supports the notion that seeding is non-polymorph-specific in the context of secondary nucleation-dominated aggregation, underscoring the irreplaceable role of pH in polymorph formation.

      Strengths

      This study systematically investigates the effects of environmental conditions and seeding on the structure of α-syn fibrils. It emphasizes the significant influence of environmental factors, especially pH, in determining the selection of α-syn polymorphs. The high-resolution structures obtained through cryo-EM enable a clear characterization of the composition and proportion of each polymorph in the sample. Collectively, this work provides a strong support for the pronounced sensitivity of α-syn fibril structures to the environmental conditions and systematically categorizes previously reported α-syn fibril structures. Furthermore, the identification of JOS-like polymorph also demonstrates the possibility of in vitro reconstruction of brain-derived α-syn fibril structures.

      Weaknesses

      All my previous concerns have been resolved to my satisfaction.

    4. Author response:

      The following is the authors’ response to the previous reviews.

      Revisions Round 1

      Reviewer #1

      We thank the reviewer for their careful reading of our manuscript and have taken all of their grammatical corrections into account.

      Reviewer #2 (Public Review): 

      Weaknesses: 

      The paper contains multiple instances of non-scientific language, as indicated below. It would also benefit from additional details on the cryo-EM structure determination in the Methods and inclusion of commonly accepted requirements for cryo-EM structures, like examples of 2D class averages, raw micrographs, and FSC curves (between half-maps as well as between rigid-body fitted (or refined) atomic models of the different polymorphs and their corresponding maps). In addition, cryo-EM maps for the control experiments F1 and F2 should be presented in Figure 9.

      We tried to correct the non-scientific language and have included the suggested data on the cryo-EM analyses, including new Figures 11-17. We did not collect data on the sample used for the seeds in the cross-seeding experiments because we had already confirmed in multiple datasets that the conditions in F1 and F2 reproducibly produce fibrils of Type 1 and Type 3, respectively. We have now analyzed cryo-EM data for 6 more samples at pH 7.0 and found that several kinds of polymorphs (Types 1A, 1M, 2A, 2B and 5) are accessible at this pH; however, the Type 3 polymorphs are not formed at pH 7.0 under the conditions that we used for aggregation.

      Reviewer #2 (Recommendations For The Authors): 

      - remove unscientific language: "it seems that there are about as many unique atomic-resolution structures of these aggregates as there are publications describing them"   

      We have rephrased this sentence.

      - for same reason, remove "Obviously, " 

      Done

      - What does this mean? “polymorph-unspecific” 

      Rephrased as non-polymorph-specific

      - What does this mean? "shallow amyloid energy hypersurface"  

      By “shallow hypersurface” we mean that the minimum of the multi-dimensional function that describes the energy of the amyloid is not so deep that subtle changes to the environment will not favor another fold/energy minimum. We have left the sentence because while it may not be perfect, it is concise and seems to get the point across.

      - "The results also confirm the possibility of producing disease-relevant structure in vitro." -> This is incorrect as no disease-relevant structure was replicated in this work. Use another word like “suggest”.

      We have changed to “suggest” as suggested.

      - Remove "historically" 

      Done

      - Rephrase “It has long been understood that all amyloids contain a common structural scaffold” 

      Changed to “It has long been established that all amyloids contain a common structural scaffold..” 

      - "Amyloid polymorphs whose differences lie in both their tertiary structure (the arrangement of the beta-strands) and the quaternary structure (protofilament-protofilament assembly) have been found to display distinct biological activities [8]" -> I don't think this is true, different biological activities of amyloids have never been linked to their distinct structures.

      We have added 5 new references (8-12) to support this sentence.

      - Reference 10 is a comment on reference 9; it should be removed. Instead, as for alpha-synuclein, all papers describing the tau structures should be included.  

      We have removed the reference, but feel that the addition of all Tau structure references is not merited in this manuscript since we are not comparing them.

      - Rephrase: "is not always 100% faithful"

      Removed “100%”

      - What is pseudo-C2 symmetry? Do the authors mean pseudo 2_1 symmetry (ie a 2-start helical symmetry)?

      Thank you for pointing this out. We did indeed mean pseudo-2_1 helical symmetry.

      - Re-phrase: "alpha-Syn's chameleon-like behavior" 

      We have removed this phrase.

      - "In the case of alpha-Syn, the secondary nucleation mechanism is based on the interaction of the positively charged N-terminal region of monomeric alpha-Syn and the disordered, negatively charged C-terminal region of the alpha-Syn amyloid fibrils [54]" -> I would say the mechanisms of secondary nucleation are not that well understood yet, so one may want to tune this down a bit. 

      We have changed this to “mechanism has been proposed to be”

      - The paragraphs describing experiments by others are better suited for a Discussion rather than a Results section. Perhaps re-organize this part? 

      We have left the text intact as we are using a Results and Discussion format.

      - A lot of information about Image processing seems to be missing: what steps were performed after initial model generation? 

      We have added more details in the methods section on the EM data processing and model analysis.

      - Figure 1: Where is Type 4 on the pH scale?

      We have adjusted the Fig 1 legend to clarify that pH scale is only applicable to the structures presented in this manuscript. 

      - Figure 2: This might be better incorporated as a subpanel of Figure 1.

      We agree that this figure is somewhat of a loner on its own and we only added it in order to avoid confusion with the somewhat inconsistent naming scheme used for the Type 1B structure. However, we prefer to leave it as a separate figure so that it does not dilute the impact of Figure 1.

      - Figure 3: What is the extra density at the bottom of Type 3B from pH 5.8 samples 1 and 2. pH 5.8 + 50mM NaCl (but not pH 5.8 + 100 mM NaCl)? Could this be an indication of a local minimum and the pH 5.8 + 100 mM NaCl structure is correct? Or is this a real difference between 0/50mM NaCl and 100 mM NaCl? 

      We did not see the extra density to which the reviewer is referring; however, the images used in this panel are based on the output of 3D classification, which is more likely to produce artifacts than a 3D refinement. With this in mind, we did not see any significant differences in the refined structures and therefore only deposited the better-quality map and model for each of the polymorph types.

      - Figure 3: To what extent is Type 3B of pH 6.5 still a mixture of different types? The density looks poor. In general, in the absence of more details about the cryo-EM maps, it is hard to assess the quality of the structures presented.  

      In order to improve the quality of the images in this panel, a more complete separation of the particles from each polymorph was achieved via the filament subset selection tool in RELION 5. In each case, an unbiased initial model could be created from the 2D classes via the relion_helix_inimodel2D program, further supporting the coexistence of 4 polymorphs in the pH 6.5 sample. The particles of each polymorph were refined separately to produce the respective maps that are now used in this figure.

      - Many references are incorrect, containing "Preprint at (20xx)" statements.

      This has been corrected.  

      Reviewer #3 (Public Review): 

      Weaknesses: 

      (1) The authors reveal that the Type 1 monofilament fibril polymorph (reminiscent of the JOS-like polymorph) and the Type 5 polymorph (akin to the tissue-amplified-like polymorph) can both form under the same condition. Additionally, this condition also fosters the formation of flat ribbon-like fibrils across different batches. Notably, at pH 5.8, variations in experimental groups yield disparate abundance ratios between polymorphs 3B and 3C, indicating a degree of instability in fibrillar formation. This variability could pose challenges for replicability in subsequent research. In light of these situations, I propose the following recommendations:

      (a) An explicit elucidation of the factors contributing to these divergent outcomes under similar experimental conditions is warranted. This should include an exploration of whether variations in purified protein batches are contributing factors to the observed heterogeneity.

      We are in complete agreement that understanding the factors that lead to polymorph variability is of utmost importance (and was the impetus for the manuscript itself). However, the number of variables to explore is overwhelming and we will continue to investigate this in our future research. Regarding the variability between batches of purified protein, we also think that this could be a factor in the polymorph variability observed for otherwise “identical” aggregation conditions, particularly at pH 7 where the largest variety of polymorphs has been observed. However, even identical replicates (samples created from the same protein solution and simply aggregated simultaneously in separate tubes) can lead to different outcomes (see datasets 15 and 16 in the revised Table 1), suggesting that there are stochastic processes that can determine the outcome of an individual aggregation experiment. While our data still indicate that the Type 1, 2 and 3 polymorphs are strongly selected by pH, the selection between interface variants 3B vs. 3C and 2A vs. 2B might also be affected by protein purity. Our standard purification protocol produces a single band by Coomassie-stained SDS-PAGE; however, minor truncations and other impurities below a few percent would go undetected and, given the proposed roles of the N- and C-termini in secondary nucleation, could have a large effect on polymorph selection and seeding. In line with the reviewer’s comments, we now include a batch number for each EM dataset. While no new conclusions can be drawn from the inclusion of this additional data, we feel that it is important to acknowledge the possible role of batch-to-batch variability.

      (b) To enhance the robustness of the conclusions, additional replicates of the experiments under the same condition should be conducted, ideally a minimum of three times.  

      The pH 5.8 conditions that yield Type 3 fibrils have already been repeated several times in the original manuscript. Since the pH 7.4 conditions produce the most common α-Syn polymorph (Type 1A) and were produced twice in this manuscript (once as an unseeded and once as a cross-seeded fibrilization), we decided to focus on the intermediate condition where the most variability had been seen (pH 7.0). The revised Table 1 now has 6 new datasets (11-16) representing 6 independent aggregations at pH 7.0 starting from two different protein purification batches. The result is that we now produce the Type 2A/B polymorphs in three samples, and in two of these samples we once again observed the Type 1M polymorph. The other samples produced Type 1A or non-twisted fibrils.

      (c) Further investigation into whether different polymorphs formed under the same buffer condition could lead to distinct toxicological and pathology effects would be a valuable addition to the study.  

      The correlation of toxicity with structure would in principle be interesting. However, the Type 1 and Type 3 polymorphs formed at pH 5.8 and 7.4 are not likely to be biologically relevant. The pH 7 polymorphs (Type 5 and 1M) would be more interesting because they form under the same conditions and might be related to some disease-relevant structures. Still, it is rare that a single polymorph appears at pH 7.0 (the Type 5 represented only 10-20% of the fibrils in the sample and the Type 1M also had unidentified double-filament fibrils in the sample). We plan to pursue this line of research and hope to include it in a future publication.

      (2) The cross-seeding study presented in the manuscript demonstrates the pivotal role of pH conditions in dictating conformation. However, an intriguing aspect that emerges is the potential role of seed concentration in determining the resultant product structure. This raises a critical question: at what specific seed concentration does the determining factor for polymorph selection shift from pH condition to seed concentration? A methodological robust approach to address this should be conducted through a series of experiments across a range of seed concentrations. Such an approach could delineate a clear boundary at which seed concentration begins to predominantly dictate the conformation, as opposed to pH conditions. Incorporating this aspect into the study would not only clarify the interplay between seed concentration and pH conditions, but also add a fascinating dimension to the understanding of polymorph selection mechanisms.

      A more complete analysis of the mechanisms of aggregation, including the effect of seed concentration and the resulting polymorph specificity of the process, are all very important for our understanding of the aggregation pathways of alpha-synuclein and are currently the topic of ongoing investigations in our lab.

      Furthermore, the study prompts additional queries regarding the behavior of cross-seeding production under the same pH conditions when employing seeds of distinct conformation. Evidence from various studies, such as those involving E46K and G51D cross-seeding, suggests that seed structure plays a crucial role in dictating polymorph selection. A key question is whether these products consistently mirror the structure of their respective seeds. 

      We thank the reviewer for reminding us to cite these studies as a clear example of polymorph selection by cross-seeding. Unfortunately, it is not entirely clear from the G51D cross-seeding manuscript (https://doi.org/10.1038/s41467-021-26433-2) which conditions were used in the cross-seeding, since different conditions were used for the seedless wild-type and mutant aggregations. However, it appears that the wild-type without seeds was in Tris pH 7.5 (although at 37 °C the pH could have dropped to about 7) and the cross-seeded wild-type was in phosphate buffer at pH 7.0. In the E46K cross-seeding manuscript, it appears that pH 7.5 Tris was used for all fibrilizations (https://doi.org/10.1073/pnas.2012435118). In any event, both results point to the fact that at pH 7.0-7.5 under low-seed conditions (0.5%) the Type 4 polymorph can propagate in a seed-specific manner.

      (3) In the Results section of "The buffer environment can dictate polymorph during seeded nucleation", the authors reference previous cell biological and biochemical assays to support the polymorph-specific seeding of MSA and PD patients under the same buffer conditions. This discussion is juxtaposed with recent research that compares the in vivo biological activities of hPFF, ampLB as well as LB, particularly in terms of seeding activity and pathology. Notably, this research suggests that ampLB, rather than hPFF, can accurately model the key aspects of Lewy Body Diseases (LBD) (refer to: https://doi.org/10.1038/s41467-023-42705-5). The critical issue here is the need to reconcile the phenomena observed in vitro with those in in-vivo or in-cell models. Given the low seed concentration reported in these studies, it is imperative for the authors to provide a more detailed explanation as to why the possible similar conformation could lead to divergent pathologies, including differences in cell-type preference and seeding capability.  

      We thank the reviewer for bringing this recent report to our attention. The findings that ampLB and hPFF have different PK digestion patterns and that only the former is able to model key aspects of Lewy Body disease are in support of the seed-specific nature of some types of alpha-synuclein aggregation. We have added this to the discussion regarding the significant role that seed type and seed conditions likely play in polymorph selection.

      (4) In the Method section of "Image processing", the authors describe the helical reconstruction procedure, without mentioning much detail about the 3D reconstruction and refinement process. For the benefit of reproducibility and to facilitate a deeper understanding among readers, the authors should enrich this part to include more comprehensive information, akin to the level of detail found in similar studies (refer to: https://doi.org/10.1038/nature23002).

      As also suggested by reviewer #2, we have now added more comprehensive information on the 3D reconstruction and refinement process.

      (5) The abbreviation of amino acids should be unified. In the Results section "On the structural heterogeneity of Type 1 polymorphs", the amino acids are denoted using three-letter abbreviation. Conversely, in the same section under "On the structural heterogeneity of Type 2 and 3 structures", amino acids are abbreviated using the one-letter format. For clarity and consistency, it is essential that a standardized format for amino acid abbreviations be adopted throughout the manuscript.

      That makes perfect sense and has been corrected.

      Reviewing Editor: 

      After discussion among the reviewers, it was decided that point 2 in Reviewer #3's Public Review (about the experiments with different concentrations of seeds) would probably lie outside the scope of a reasonable revision for this work. 

      We agree as stated above and will continue to work on this important point.

      Revisions Round 2

      Reviewer #2 (Public Review): 

      I do worry that the FSC values of model-vs-map appear to be higher than expected from the corresponding FSCs between the half-maps (e.g. see Fig 13). The implication of this observation is that the atomic models may have been overfitted in the maps, which would have led to a deterioration of their geometry. A table with rmsd on bond lengths, angles, etc would probably show this. In addition, to check for overfitting, the atomic model for each data set could be refined in one of the half-maps, and then that same model could be used to calculate 2 FSC model-vs-map curves: one against the half-map it was refined in and one against the other half-map. Deviations between these two curves are an indication of overfitting. 

      Thank you for the recommendations for model validation. We have added the suggested statistics to Table 2, performed the suggested model fitting against one of the half-maps, and plotted 3 FSC model-vs-map curves: one for each half-map versus the model fit against only one half-map, and one for the model fit against the full map. We feel that the degree of overfitting is reasonable and does not significantly impact the quality of the models.
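
      The half-map cross-validation discussed here can be sketched as follows. This is only a minimal illustration on synthetic volumes, assuming NumPy arrays of equal cubic size; it is not the software or procedure used for the deposited maps, and the names fourier_shell_correlation and mean_fsc_gap are hypothetical. A model that has absorbed noise from the half-map it was refined against correlates better with that "work" half-map than with the independent "free" half-map.

```python
import numpy as np


def fourier_shell_correlation(vol_a, vol_b):
    """FSC per integer-radius Fourier shell (shell 1 upward) for two volumes."""
    if vol_a.shape != vol_b.shape:
        raise ValueError("volumes must have the same shape")
    fa = np.fft.fftn(vol_a)
    fb = np.fft.fftn(vol_b)
    # Integer shell index (radius in Fourier voxels) of every coefficient.
    freqs = [np.fft.fftfreq(n) * n for n in vol_a.shape]
    grid = np.meshgrid(*freqs, indexing="ij")
    radius = np.rint(np.sqrt(sum(g ** 2 for g in grid))).astype(int)
    n_shells = min(vol_a.shape) // 2
    mask = radius < n_shells
    shells = radius[mask]
    cross = np.bincount(shells, weights=(fa * np.conj(fb)).real[mask],
                        minlength=n_shells)
    power_a = np.bincount(shells, weights=(np.abs(fa) ** 2)[mask],
                          minlength=n_shells)
    power_b = np.bincount(shells, weights=(np.abs(fb) ** 2)[mask],
                          minlength=n_shells)
    denom = np.sqrt(power_a * power_b)
    fsc = np.divide(cross, denom, out=np.zeros(n_shells), where=denom > 0)
    return fsc[1:]  # drop the DC term, which holds a single coefficient


def mean_fsc_gap(model, work_half, free_half):
    """Mean of FSC(model, work half-map) minus FSC(model, free half-map)."""
    fsc_work = fourier_shell_correlation(model, work_half)
    fsc_free = fourier_shell_correlation(model, free_half)
    return float(np.mean(fsc_work - fsc_free))


# Synthetic demonstration with made-up data (not the deposited maps).
rng = np.random.default_rng(0)
signal = rng.normal(size=(32, 32, 32))
noise_1 = rng.normal(size=signal.shape)
noise_2 = rng.normal(size=signal.shape)
half_1 = signal + noise_1          # half-map used for "refinement"
half_2 = signal + noise_2          # independent half-map
clean_model = signal               # model that fitted only the signal
overfit_model = signal + 0.5 * noise_1  # model that also fitted half_1 noise

print("gap, clean model:  ", mean_fsc_gap(clean_model, half_1, half_2))
print("gap, overfit model:", mean_fsc_gap(overfit_model, half_1, half_2))
```

      In this toy setting the gap stays near zero for the clean model and becomes clearly positive for the overfitted one, which is the deviation between the two curves that the reviewer describes. Real maps would additionally need masking and a map of the model density, which are omitted here.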

      In addition, the sudden drop in the FSC curves in Figure 16 shows that something unexpected has happened to this refinement. Are the authors sure that only the procedures outlined in the Methods were used to create these curves? The unexpected nature of the FSC curve for this type (2A) raises doubts about the correctness of the reconstruction. 

      We thank the reviewer for the attention to detail. We should have caught this mistake. It turns out that in the last round of 3D refinement, the two half-maps became shifted with respect to each other in the z direction. We realigned the two maps using Chimera and then re-ran the postprocessing. The new maps have been deposited in EMD-50850. This mistake motivated us to inspect all of the maps, and we found that the same problem had occurred in the Type 3B maps. This was not noticed by the reviewer because we accidentally plotted the FSC curves from the postprocessing of one refinement round before the one deposited in the EMD. We performed the same half-map shifting procedure for the Type 3B data and performed a final round of real-space refinement to produce new maps and models that have been deposited as EMD-50888 and 9FYP (superseding the previous entries).
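
      A relative z shift between two half-maps of the kind described here can be detected numerically. The sketch below is an illustration under stated assumptions, not the Chimera-based realignment that was actually used: it assumes NumPy arrays with z as axis 0, an integer-voxel shift, and circular (wrap-around) correlation, and the function name estimate_z_shift is hypothetical.

```python
import numpy as np


def estimate_z_shift(half_map_1, half_map_2):
    """Integer number of voxels to roll half_map_2 along axis 0 (z)
    so that it best matches half_map_1, using circular cross-correlation."""
    if half_map_1.shape != half_map_2.shape:
        raise ValueError("half-maps must have the same shape")
    f1 = np.fft.fftn(half_map_1)
    f2 = np.fft.fftn(half_map_2)
    # Cross-correlation theorem: cc[s] = sum_x map1[x + s] * map2[x].
    cc = np.fft.ifftn(f1 * np.conj(f2)).real
    profile = cc[:, 0, 0]          # shifts along z only, none in x or y
    shift = int(np.argmax(profile))
    n = half_map_1.shape[0]
    # Report shifts past the midpoint as negative (wrap-around).
    return shift - n if shift > n // 2 else shift


# Synthetic check with made-up data: shift a random volume by 3 voxels in z.
rng = np.random.default_rng(1)
reference = rng.normal(size=(16, 8, 8))
shifted = np.roll(reference, 3, axis=0)
correction = estimate_z_shift(reference, shifted)
realigned = np.roll(shifted, correction, axis=0)
print("estimated correction along z:", correction)
```

      A non-zero result for two half-maps from the same refinement would flag the kind of misalignment that produced the sudden drop in the FSC curve. Sub-voxel shifts, or shifts coupled to a rotation about the helical axis, would need a more careful treatment than this sketch provides.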

      Reviewer #3 (Public Review): 

      There are two minor points I recommend the authors to address: 

      (1) In the response to Weakness 1, point (3), the authors state that "the Type 5 represented only 10-20% of the fibrils in the sample." However, this information is not labeled in the corresponding Figure 4. I suggest the authors verify and label all relevant percentages in the figures to prevent misunderstandings. 

      We aim to be as transparent as possible and this information was included in the main text; however, we did not label the percentage of Type 5 fibrils in Figure 4 because that would make the other percentages ambiguous. The percentages in Figure 4 represent the ratio of helical segments used for each type of refined structure in the dataset (always adding up to 100%), not the percent of all fibrils in the dataset. That is, there are sometimes untwisted or unidentifiable fibrils in datasets and these were not accounted for in the listed percentages. We have added a sentence to the Figure 4 legend to explain what the percentages refer to.
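
      The difference between the two kinds of percentages can be made concrete with a small calculation. The segment counts below are invented purely for illustration and do not come from any dataset in the manuscript; the function name segment_percentages is likewise hypothetical.

```python
def segment_percentages(segment_counts):
    """Convert per-class helical segment counts into percentages of their sum."""
    total = sum(segment_counts.values())
    if total == 0:
        raise ValueError("no segments to normalize")
    return {name: 100.0 * count / total
            for name, count in segment_counts.items()}


# Hypothetical counts: segments that went into refined structures ...
refined = {"Type 2A": 6000, "Type 5": 2000}
# ... and segments from untwisted or unidentifiable fibrils (also hypothetical).
unassigned = {"untwisted/unidentified": 8000}

# Percentages as defined for Figure 4: refined classes only, summing to 100%.
among_refined = segment_percentages(refined)

# Percentages relative to every segment picked in the dataset.
among_all = segment_percentages({**refined, **unassigned})

print(among_refined)
print(among_all)
```

      With these made-up numbers the same class accounts for a quarter of the refined segments but only an eighth of all segments, which is why mixing the two conventions in a single figure would be ambiguous.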

      (2) While the authors have detailed the helical reconstruction procedure in the Methods section, it is necessary to indicate the scale bar or box size in the figure legend of the 2D representative classes to ensure clarity and reproducibility. 

      Thank you for reminding us to add the scale bars. This is now done for the 2D classes in Figures 11-17.

      Recommendations for the authors: 

      Reviewer #2 (Recommendations For The Authors): 

      A critical look at the maps and models of the various structures at this stage may prevent the authors from entering suboptimal structures into the databases.  

      We agree. Thank you for suggesting this.

      Reviewer #3 (Recommendations For The Authors): 

      The authors have responded adequately to these critiques in the revised version of the manuscript. There are two minor points. 

      (1) The authors state that "the Type 5 represented only 10-20% of the fibrils in the sample." However, this information is not labeled in the corresponding Figure 4. I suggest the authors verify and label all relevant percentages in the figures to prevent misunderstandings. 

      (2) While the authors have detailed the helical reconstruction procedure in the Methods section, it is necessary to indicate the scale bar or box size in the figure legend of the 2D representative classes to ensure clarity and reproducibility. 

      Answered in public comments

    1. eLife assessment

      Using multiple public datasets, this study investigates associations between retrotransposon element expression and methylation with age and inflammation. The study is valuable because a systematic analysis of retrotransposon element expression during human aging has been lacking, but the provided data must be considered incomplete due to the sole reliance on microarray expression data for the core analyses.

    2. Reviewer #1 (Public Review):

      Tsai and Seymen et al. investigate associations between RTE expression and methylation and age and inflammation, using multiple public datasets. Compared to the previous round of review, the text of the manuscript has been polished and the phrasing of several findings has been made clearer and more precise. The authors also provided ample discussion to the prior reviewer comments in their rebuttal, including new analyses. All these changes are in the correct direction, however, I believe that part of the content of the rebuttal should be incorporated in the main text, for reasons that I will outline below.

      Both reviewers found the reliance on microarray expression data to detract from the study. The authors argued that their choices are supported by existing publications which performed a similar quantification of TE expression using microarray data. It could still be argued that (as far as I can tell) Reichmann et al. used a substantially larger number of probes than this study, as a consequence of starting from different arrays; however, this is a minor point which the authors do not need to address. It is still undeniable that including the validation with RNA-seq data performed in the rebuttal would strengthen the manuscript. I especially believe that many readers would want to see this analysis featured prominently in the manuscript, considering that both reviewers independently converged on the issue with microarray expression data. Personally, I would have included an RNA-seq dataset next to the microarray data in the main figures; however, I understand that this would require considerable restructuring and that placing RNA-seq data beside array data might be misleading. Instead, I would ask that the authors include their rebuttal figures R1 and R2 as supplementary figures.

      I would suggest introducing a new paragraph, between the section dedicated to expression data and the one dedicated to DNA methylation, mentioning the issues with microarray data (some of which were mentioned by the reviewers and others by the authors in the discussion and introduction) and then introducing the validation with RNA-seq data.

      Figure R3 is also a good addition and should be expanded to include the GTP and MESA study and possibly mentioned in the paragraph titled "RTE expression positively correlates with BAR gene signature scores except for SINEs."

      "In this study, we did not compare MESA with GTP etc. We have analysed each dataset separately based on the available data for that dataset. Therefore, sacrificing one analysis because of the lack of information from the other does not make sense. We would do that if we were after comparing different datasets. Moreover, the datasets are not comparable because they were collected from different types of blood samples."

      Indeed, the datasets are not compared directly, but the associations between age, BAR and TE expression for each dataset are plotted and discussed right next to each other. It is therefore natural to wonder if the differences between datasets are due to differences in the type of blood sample or if they are a consequence of the different probe sets. Using a common set of probes would help answer that question.

    3. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment 

      This study investigates associations between retrotransposon element expression and methylation with age and inflammation, using multiple public datasets. The study is valuable because a systematic analysis of retrotransposon element expression during human aging has been lacking. However, the data provided are incomplete due to the sole reliance on microarray expression data for the core analysis of the paper. 

      Both reviewers found this study to be important. We have selected the microarray datasets of human blood adopted by a comprehensive study of ageing published in Nature Communications (DOI: 10.1038/ncomms9570). We only included the datasets specifically collected for ageing studies. Therefore, the large RNA-seq cohorts for cancer, cardiovascular, and neurological diseases were not relevant to this study and cannot be included.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      Tsai and Seymen et al. investigate associations between RTE expression and methylation and age and inflammation, using multiple public datasets. The concept of the study is in principle interesting, as a systematic analysis of RTE expression during human aging is lacking. 

      We thank the reviewer for the positive comment. 

      Unfortunately, the reliance on expression microarray data, used to perform the core analysis of the paper places much of the study on shaky ground. The findings of the study would not be sufficiently supported until the authors validate them with more suitable methods. 

      In our discussion section in the manuscript, we have clarified that “we are aware of the limitations imposed by using microarray in this study, particularly the low number of intergenic probes in the expression microarray data. Our study can be enriched with the advent of large  RNA-seq cohorts for aging studies in the future.”  However, the application of microarray for RTE expression analysis was introduced previously (DOI: 10.1371/journal.pcbi.1002486) and applied in some highly cited and important publications before (DOI: 10.1038/ncomms1180, DOI: 10.1093/jnci/djr540). In fact, in a manuscript published by Reichmann et al.  (DOI: 10.1371/journal.pcbi.1002486) which was cited 76 times, the authors showed and experimentally verified that cryptic repetitive element probes present in Illumina and Affymetrix gene expression microarray platforms can accurately and sensitively monitor repetitive element expression data. Inspired by this methodological manuscript with reasonable acceptance by other researchers, we trusted that the RTE microarray probes could accurately quantify RTE expression at class and family levels.

      Strengths: 

      This is a very important biological problem. 

      Weaknesses: 

      RNA microarray probes are obviously biased to genes, and thus quantifying transposon analysis based on them seems dubious. Based on how arrays are designed there should at least be partial (perhaps outdated evidence) that the probe sites overlap a protein-coding or non-coding RNA. 

      We disagree with the reviewer that quantifying transposon analysis based on microarray data is dubious. As previously shown by Reichmann et al., the quantification is reliable as long as the probes do not overlap with annotated genes and they are in the correct orientation to detect sense repetitive element transcripts. Reichmann et al. identified 1,400 repetitive element probes in version 1.0, version 1.1 and version 2.0 of the Illumina Mouse WG-6 Beadchips by comparing the genomic locations of the probes with the RepeatMasked regions of the mouse genome. We applied the same criteria for Illumina Human HT-12 V3 (29,431 probes) and V4 (33,963 probes) to identify the RTE-specific probes.
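
      The selection criteria described above can be sketched as a simple interval filter. This is an illustrative sketch only, not the code used in the study or by Reichmann et al.: the `Interval` type, the function names, the toy coordinates, and the convention that a probe's strand must equal the repeat's strand to detect sense transcripts are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    strand: str  # "+" or "-"


def overlaps(a, b):
    """True if two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def select_rte_probes(probes, repeats, annotated):
    """Keep probes that hit a repeat in sense orientation and avoid annotated features."""
    selected = []
    for name, probe in probes.items():
        hits_repeat = any(
            overlaps(probe, r) and probe.strand == r.strand for r in repeats
        )
        hits_annotation = any(overlaps(probe, g) for g in annotated)
        if hits_repeat and not hits_annotation:
            selected.append(name)
    return selected


# Toy coordinates, invented for illustration.
repeats = [Interval("chr1", 100, 400, "+"), Interval("chr2", 50, 300, "-")]
annotated = [Interval("chr1", 350, 500, "+")]
probes = {
    "probe_a": Interval("chr1", 150, 200, "+"),  # repeat, sense, no annotation
    "probe_b": Interval("chr1", 360, 410, "+"),  # repeat, but overlaps annotation
    "probe_c": Interval("chr2", 100, 150, "+"),  # repeat on the opposite strand
    "probe_d": Interval("chr3", 10, 60, "+"),    # no repeat at all
}
selected = select_rte_probes(probes, repeats, annotated)
```

      With real data, the repeat intervals would come from a RepeatMasker annotation and the excluded features from a gene annotation; a production version would also use an interval index rather than the quadratic scan shown here.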

      The authors state they only used intergenic probes, but based on supplementary files, almost half of RTE probes are not intergenic but intronic (n=106 out of 264). 

      All our identified RTE probes overlap with intergenic regions. However, due to their repetitive nature, some probes overlap with intronic regions, too. We have replaced "intergenic" with "non-coding" in our resubmission to show that they do not overlap with the exons of protein-coding genes. However, we do not rule out the possibility that some of our detected RTE probes might overlap non-coding RNAs. In fact, the border between coding and non-coding genomes has recently become very fuzzy with new annotations of the genome. RTE RNAs can be easily considered as non-coding RNAs if we challenge our traditional junk DNA view.

      This is further complicated by the fact that not all this small subset of probes is available in all analyzed datasets. For example, 232 probes were used for the MESA dataset but only 80 for the GTP dataset. Thus, RTE expression is quantified with a set of probes which is extremely likely to be highly affected by non-RTE transcripts and that is also different across the studied datasets. Differences in the subsets of probes could very well explain the large differences between datasets in multiple of the analyses performed by the authors, such as in Figure 2a, or 3a. It is nonetheless possible that the quantification of RTE expression performed by the authors is truly interpretable as RTE expression, but this must be validated with more data from RNA-seq. Above all, microarray data should not be the main type of data used in the type of analysis performed by the authors. 

      In this study, we did not compare MESA with GTP etc. We have analysed each dataset separately based on the available data for that dataset. Therefore, sacrificing one analysis because of the lack of information from the other does not make sense. We would do that if we were after comparing different datasets. Moreover, the datasets are not comparable because they were collected from different types of blood samples. 

      Reviewer #2 (Public Review): 

      Summary: 

      Yi-Ting Tsai and colleagues conducted a systematic analysis of the correlation between the expression of retrotransposable elements (RTEs) and aging, using publicly available transcriptional and methylome microarray datasets of blood cells from large human cohorts, as well as single-cell transcriptomics. Although DNA hypomethylation was associated with chronological age across all RTE biotypes, the authors did not find a correlation between the levels of RTE expression and chronological age. However, expression levels of LINEs and LTRs positively correlated with DNA demethylation, and inflammatory and senescence gene signatures, indicative of "biological age". Gene set variation analysis showed that the inflammatory response is enriched in the samples expressing high levels of LINEs and LTRs. In summary, the study demonstrates that RTE expression correlates with "biological" rather than "chronological" aging. 

      Strengths: 

      The question the authors address is both relevant and important to the fields of aging and transposon biology. 

      We thank the reviewer for finding this study relevant and important.

      Weaknesses: 

      The choice of methodology does not fully support the primary claims. Although microarrays can detect certain intergenic transposon sequences, the authors themselves acknowledge in the Discussion section that this method's resolution is limited. More critical considerations, however, should be addressed when interpreting the results. The coverage of transposon sequences by microarrays is not only very limited (232 unique probes) but also predetermined. This implies that any potential age-related overexpression of RTEs located outside of the microarray-associated regions, or of polymorphic intact transposons, may go undetected. Therefore, the authors should be more careful while generalising their conclusions. 

      This is a bioinformatics study, and we have already acknowledged and discussed the limitations in the discussion section of this manuscript. All technologies have their own limitations, and this should not stop us from shedding light on scientific facts because of inadequate information. In the manuscript, we have discussed that all large and proper ageing studies were performed using microarray technology. Peters et al. (DOI: 10.1038/ncomms9570) adopted all these datasets in their transcriptional landscape of ageing manuscript, which was used in previous studies of ageing as well. Our study essentially applies the Reichmann et al. method to the peripheral blood-related data from the Peters et al. manuscript. Since hypomethylation due to ageing is a well-established and broad epigenetic reprogramming, it is unlikely that only a fraction of RTEs is affected by this phenomenon. Therefore, the subsampling of RTEs should not substantially affect the result. Indeed, this is supported in our study by the inverse correlation between DNA methylation and RTE expression for LINE and SINE classes despite the limited numbers of probes for LINE and SINE expression.

      Additionally, for some analyses, the authors pool signals from RTEs by class or family, despite the fact that these groups include subfamilies and members with very different properties and harmful potentials. For example, while sequences of older subfamilies might be passively expressed through readthrough transcription, intact members of younger groups could be autonomously reactivated and cause inflammation. The aggregation of signals by the largest group may obscure the potential reactivation of smaller subgroups. I recommend grouping by subfamily or, if not possible due to the low expression scores, by subgroup. For example, all HERV subfamilies are from the ERVL family. 

      We agree with the reviewer that different subfamilies of RTEs play different roles through their activation. However, we will lose our statistical power if we study RTE subfamilies with a few probes. Global epigenetic alteration and derepression of RTEs by ageing have been observed to be genome-wide. While our systematic analysis across RTE classes and families cannot capture alterations in subfamilies due to statistical power, it is still relevant to the research question we are addressing.

      Next, Illumina arrays might not accurately represent the true abundance of TEs due to nonspecific hybridization of genomic transposons. Standard RNA preparations always contain traces of abundant genomic SINEs unless DNA elimination is specifically thorough. The problem of such noise should be addressed. 

      We have checked the RNA isolation step from the MESA, GTP, and GARP manuscripts. The total RNA was isolated using the Qiagen mini kit following the manufacturer’s recommendations. The authors of these manuscripts did not mention whether they eliminated genomic DNA, but we assumed they were aware of the DNA contamination and eliminated it based on the manufacturer’s recommendations. We have searched the literature on nonspecific hybridization of RTEs but could not find any evidence to support this observation. We would appreciate the reviewers providing more evidence about such RTE contamination.

      Lastly, scRNAseq was conducted using 10x Genomics technology. However, quantifying transposons in 10x sequencing datasets presents major challenges due to sparse signals. 

      Applying the scTE pipeline (https://www.nature.com/articles/s41467-021-21808-x), we have found that the statistical power for quantifying RTE classes (LINE, SINE, and LTR) or RTE families (L1, L2, Alu, ERVK, etc.) is as good as that for individual genes. However, our proposed method cannot analyse RTE subfamilies, and we did not do that.

      Smart-seq single-cell technology is better suited to this particular purpose. 

      We agree with the reviewer that Smart-seq provides higher yield than 10x, but there are no Smart-seq data available for ageing studies.

      Anyway, it would be more convincing if the authors demonstrated TE expression across different clusters of immune cells using standard scRNAseq UMAP plots instead of boxplots. 

      Since the number of RTE reads per cell is low, showing the expression of RTEs per cell in UMAP may not be the best statistical approach to show the difference between the aged and young groups. This is why we chose to analyse with Pseudobulk and displayed differential expression using boxplot rather than UMAP for each immune cell type. 
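
      The idea behind a pseudobulk analysis can be sketched as follows. This is a generic illustration, not the pipeline used in the study: the cell records, donor labels, and counts are invented, and the function simply sums sparse per-cell counts into one profile per donor and cell type so that groups can then be compared at the sample level.

```python
from collections import defaultdict


def pseudobulk(cells):
    """Sum per-cell counts into one profile per (donor, cell_type) pair."""
    totals = defaultdict(lambda: defaultdict(int))
    for cell in cells:
        key = (cell["donor"], cell["cell_type"])
        for feature, count in cell["counts"].items():
            totals[key][feature] += count
    return {key: dict(features) for key, features in totals.items()}


# Invented toy data: each record is one cell with sparse RTE family counts.
cells = [
    {"donor": "young_1", "cell_type": "T cell", "counts": {"L1": 1, "Alu": 0}},
    {"donor": "young_1", "cell_type": "T cell", "counts": {"L1": 0, "Alu": 2}},
    {"donor": "aged_1", "cell_type": "T cell", "counts": {"L1": 2, "Alu": 1}},
    {"donor": "aged_1", "cell_type": "B cell", "counts": {"L1": 1}},
]
profiles = pseudobulk(cells)
```

      Aggregating first means that each donor contributes one observation per cell type, which suits boxplot comparisons between age groups better than per-cell values dominated by zeros.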

      I recommend validating the data by RNAseq, even on small cohorts. Given that the connection between RTE overexpression and inflammation has been previously established, the authors should consider better integrating their observations into the existing knowledge. 

      Please see below. We have analysed RNA-seq data suggested by Reviewer 1 in the Recommendations for the Authors section.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      I can recommend two sizeable human PBMC RNA-seq datasets that the authors could use:

      Marquez et al. 2020 (phs001934.v1.p1, controlled access) and Morandini et al. 2023 (GSE193141, public access). There are likely other suitable datasets that I am not aware of. I would also recommend using identical sets of probes to quantify RTE expression across studies. If certain datasets have too few probes and would thus limit the number of probes available across all studies it might be a good idea to exclude the dataset, especially if the analysis has been supplemented by the additional RNA-seq datasets. 

      Until recently, there was no publicly available, non-cancerous, large cohort of RNA-seq data for ageing studies. We tried to gain access to the two RNA-seq datasets suggested by Reviewer 1: Marquez et al. 2020 (phs001934.v1.p1, controlled access) and Morandini et al. 2023 (GSE193141, public access).

      Unfortunately, Marquez et al. 2020 data is not accessible because the authors only provide the data for projects related to cardiovascular diseases. However, we did analyse Morandini et al. 2023 data, and we can confirm that no association was observed between any class and family of RTEs with chronological ageing (Author response image 1), which is the second strong piece of evidence supporting the statement in the manuscript. However, as expected, we found a positive correlation between RTE expression and IFN-I signature score (Author response image 2).

      Author response image 1.

      Linear analysis of RTE expression and chronological age.

      Author response image 2.

      Linear analysis of RTE expression and IFN gene signature expression.

      The authors use "biological age" and inflammation as interchangeable concepts, including in the title. Please correct this wording. 

      We have now added a new term to the manuscript, “biological age-related (BAR)”, which clearly addresses this distinction. We do not think the title needs to be changed.

      The authors find correlations between RTE expression and age-associated gene signatures but not chronological age itself. This is puzzling because, as the wording suggests, the expression of these inflammatory pathways is age-associated. If RTE expression correlates with inflammation which itself correlates with age, one might expect RTE expression to also correlate with age. Do the authors see a correlation between various inflammatory gene signatures and chronological age, in the analyzed datasets? If yes, then how would you explain that discrepancy? Moreover, in this case, I would recommend using a linear model, rather than correlation, to separate the effects of chronological age and RTE expression on inflammation (Inflammation et al ~ Age + RTE expression), or equivalent designs.

      As described above, we have now introduced the BAR terminology, which resolves this confusion. We did not find a correlation between RTE expression and chronological age. However, we did identify the correlation between BAR gene signatures and RTE expression.

      To separate the effects of chronological age and RTE expression on BAR gene signature scores, we performed a generalized linear model (GLM) analysis using BAR gene signature scores as response variables and RTE expression and chronological age as predictors (BAR gene signature scores ~ RTE expression + chronological age). A significant association was observed between BAR gene signature scores and RTE expression in the GARP cohort (Author response image 3). However, when chronological age is considered as a predictor, we did not identify a correlation between chronological age and BAR gene signatures, indicating that BAR events are not correlated with chronological age (Author response image 3).

      Author response image 3.

      Generalized linear models (GLM) analysis (BAR gene signature scores ~ RTE expression + chronological age). For each RTE family, we separately performed GLM. Age (RTE family) indicates the chronological age when used in the design formula for that specific RTE family. 
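
      The design formula above can be illustrated with a minimal sketch on simulated data. Several things here are assumptions rather than details from the study: the GLM family is not stated, so the sketch uses a Gaussian model with identity link (equivalent to ordinary least squares); the data are simulated so that the score depends on RTE expression but not on age; and the helper names `solve` and `fit_linear` are hypothetical.

```python
import random


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_linear(predictors, y):
    """Least-squares fit of y ~ 1 + predictors; returns [intercept, slopes...]."""
    X = [[1.0] + list(row) for row in predictors]
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)


# Simulated cohort: the score is driven by RTE expression only (by construction).
random.seed(0)
n = 200
age = [random.uniform(40, 80) for _ in range(n)]
rte = [random.gauss(0, 1) for _ in range(n)]
score = [0.5 * r + random.gauss(0, 0.1) for r in rte]

# score ~ RTE expression + chronological age
intercept, beta_rte, beta_age = fit_linear(list(zip(rte, age)), score)
```

      In this simulated setting the fitted RTE coefficient is close to the simulated effect and the age coefficient is close to zero, which is the pattern the two-predictor design is meant to distinguish. A real analysis would also report standard errors and p-values, which this sketch omits.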

      Some of the gene sets used by the authors have considerable overlap with others and are also not particularly comprehensive. I can recommend this very comprehensive gene set: https://www.gsea-msigdb.org/gsea/msigdb/human/geneset/SAUL_SEN_MAYO.  

      We chose not to use large gene lists such as the suggested SEN_MAYO list, as we found that Singscore struggles to generate reliable scores with sufficient variance when the number of genes increases to more than twenty. Although there is some overlap between inflammation-related genes and cellular senescence genes (e.g., IL6, IL1A, IL1B), it is important to note that each gene list focuses on different aspects of biological aging and should not be dismissed as redundant.

      Minor comments: 

      Overall, several sentences in the manuscript feel somewhat unnatural. I would recommend further proofreading. I will mention some examples:  

      Thank you for your feedback. We have fixed all these issues in the new submission.  

      • On line 34, "like the retroviruses" should be "like retroviruses". There are several other places in the text where "the" is not required. 

      Fixed.

      • On line 86, "to generate the RTE expression". "the" is again not necessary and I would replace "generate" with "quantify". 

      Fixed.

      • On line 86, "we mapped the probe locations to RepeatMasker". RepeatMasker is not a genome. Do you mean you mapped the probe location to a genome annotated by RepeatMasker? The same applies to line 99.  

      Fixed. We changed the sentence to: “To quantify RTE expression, we mapped the microarray probe locations to RTE locations in RepeatMasker to extract the list of noncoding (intergenic or intronic) probes that cover the RTE regions.”

      • Figure 1 contains a typo in the aims section: "evetns" instead of "events".  

      Fixed.

      • On line 495 "filtered out" seems to imply you removed intergenic probes. I assume you mean that you specifically selected intergenic probes. 

      Fixed.

      • Figure 1 nicely summarizes your datasets. Could you add a Figure 1b panel showing how you used RNA arrays to quantify RTE expression? This should include the number of probes for each RTE family, so I suggest merging this with Figure S1.  

      We disagree with the suggestion to merge Figure 1 and Figure S1 because they address two different concepts.

      Reviewer #2 (Recommendations For The Authors): 

      In Figure 2c, it is unclear what colour scale has been used for age. 

      Thank you for the comment. We have added a legend for age in this figure.

      There are no figure legends for Supplementary Figures 1 to 5 and all figures after Supplementary Figure 8. 

      A new version with legends has been submitted.

      For different datasets used, the choice of "healthy" patients should be more clear and explicit.

      Are asymptomatic patients with autoimmune inflammatory disorders considered "healthy"? If blood from non-healthy patients is also analysed (such as PBMCs from primary osteoarthrosis), how can the authors tell whether the inflammatory signature enrichment discovered in this study is associated with "biological age" rather than with the disease itself? 

      In our analysis, we did not exclusively study "healthy" individuals, as none of our datasets were initially collected from strictly healthy populations. While the microarray datasets were not specifically collected from people with particular diseases, they were also not screened for asymptomatic conditions. To demonstrate the same pattern in healthier cohorts, we added scRNA-seq analysis of confirmed healthy individuals to our study. However, the focus of this study is not on healthy aging. Instead, it is on biological ageing that includes both healthy and non-healthy ageing.

      We included the GARP (primary osteoarthritis) dataset as it is a cohort of age-related diseases (ARD). While we cannot definitively attribute inflammatory signatures enrichment to biological aging or disease, the observation of such enrichment in a cohort of ARD is worth considering. To make this clearer, we have replaced the term “healthy” with “non-cancerous” for microarray analysis throughout the paper.

    1. Reviewer #1 (Public Review):

      In this study, the authors report an essential role of AARS2 in maintaining cardiac function. They also investigated the underlying mechanism, in which AARS2 regulates alanine and PKM2 translation. Accordingly, a therapeutic strategy for cardiomyopathy and MI was proposed. Several points need to be addressed to make this article more comprehensive:

      (1) Include apoptotic caspases in Figure 2B, and Figure 4 B and E as well.

      (2) It would be better to show the change of apoptosis-related proteins upon the knocking down of AARS2 by small interfering RNA (siRNA).

      (3) In Figure 5, the authors performed Mass Spectrometry to assess metabolites of homogenates. I was wondering if the change of other metabolites could be provided in the form of a heatmap.

      (4) The amounts of lactate should be assessed using a lactate assay kit to validate the Mass Spectrometry results.

      (5) What is the expression pattern of PKM2 before and after MI in mice? Furthermore, is there a correlation between AARS2 and PKM2?

      (6) In Figure 5, how about the change of apoptosis-related proteins after administration of PKM2 activator TEPP-46?

    2. eLife assessment

      This valuable study demonstrates that AARS2 is crucial for protecting cardiomyocytes from ischemic stress by shifting energy metabolism towards glycolysis through PKM2, presenting a novel therapeutic target for myocardial infarction. The findings are supported by solid evidence, including cardiomyocyte-specific genetic modifications, functional assays, and ribosome profiling, which together robustly validate the AARS2-PKM2 signaling pathway's role in cardiac protection.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors aimed to elucidate the role of AARS2, an alanyl-tRNA synthetase, in mouse hearts, specifically its impact on cardiac function, fibrosis, apoptosis, and metabolic pathways under conditions of myocardial infarction (MI). By investigating the effects of both deletion and overexpression of AARS2 in cardiomyocytes, the study aims to determine how AARS2 influences cardiac health and survival during ischemic stress.

      The authors successfully achieved their aims by demonstrating the critical role of AARS2 in maintaining cardiomyocyte function under ischemic conditions. The evidence presented, including genetic manipulation results, functional assays, and mechanistic studies, robustly supports the conclusion that AARS2 facilitates cardiomyocyte survival through PKM2-mediated metabolic reprogramming. The study convincingly links AARS2 overexpression to improved cardiac outcomes post-MI, validating the proposed protective AARS2-PKM2 signaling pathway.

      This work may have a significant impact on the field of cardiac biology and ischemia research. By identifying AARS2 as a key player in cardiomyocyte survival and metabolic regulation, the study opens new avenues for therapeutic interventions targeting this pathway. The methods used, particularly the cardiomyocyte-specific genetic models and ribosome profiling, are valuable tools that can be employed by other researchers to investigate similar questions in cardiac physiology and pathology.

      Understanding the metabolic adaptations in cardiomyocytes during ischemia is crucial for developing effective treatments for MI. This study highlights the importance of metabolic flexibility and the role of specific enzymes like AARS2 in facilitating such adaptations. The identification of the AARS2-PKM2 axis adds a new layer to our understanding of cardiac metabolism, suggesting that enhancing glycolysis can be a viable strategy to protect the heart from ischemic damage.

      Strengths:

      (1) Comprehensive Genetic Models: The use of cardiomyocyte-specific AARS2 knockout and overexpression mouse models allowed for precise assessment of AARS2's role in cardiac cells.

      (2) Functional Assays: Detailed phenotypic analyses, including measurements of cardiac function, fibrosis, and apoptosis, provided evidence for the physiological impact of AARS2 manipulation.

      (3) Mechanistic Insights: This study used ribosome profiling (Ribo-Seq) to uncover changes in protein translation, specifically highlighting the role of PKM2 in metabolic reprogramming.

      (4) Therapeutic Relevance: The use of the PKM2 activator TEPP-46 to reverse the effects of AARS2 deficiency presents a potential therapeutic avenue, underscoring the practical implications of the findings.

      Weaknesses:

      (1) Species Limitation: The study is limited to mouse and rat models, and while these are highly informative, further validation in human cells or tissues would strengthen the translational relevance.

      (2) Temporal Dynamics: The study does not extensively address the temporal dynamics of AARS2 expression and PKM2 activity during the progression of MI and recovery, which could offer deeper insights into the timing and regulation of these processes.

    4. Reviewer #3 (Public Review):

      In the present study, the authors revealed that cardiomyocyte-specific deletion of mouse AARS2 resulted in evident cardiomyopathy with impaired cardiac function, notable cardiac fibrosis, and cardiomyocyte apoptosis. Cardiomyocyte-specific AARS2 overexpression in mice improved cardiac function and reduced cardiac fibrosis after myocardial infarction (MI), without affecting cardiomyocyte proliferation and coronary angiogenesis. Mechanistically, AARS2 overexpression suppressed cardiomyocyte apoptosis and mitochondrial reactive oxygen species production, and changed cellular metabolism from oxidative phosphorylation toward glycolysis in cardiomyocytes, thus leading to cardiomyocyte survival from ischemia and hypoxia stress. Ribo-Seq revealed that AARS2 overexpression increased pyruvate kinase M2 (PKM2) protein translation and the ratio of PKM2 dimers to tetramers that promote glycolysis. Additionally, PKM2 activator TEPP-46 reversed cardiomyocyte apoptosis and cardiac fibrosis caused by AARS2 deficiency. Thus, this study demonstrates that AARS2 plays an essential role in protecting cardiomyocytes from ischemic stress via fine-tuning PKM2-mediated energy metabolism, and presents a novel cardiac protective AARS2-PKM2 signaling during the pathogenesis of MI. This study provides some new knowledge in the field, but there are still some questions that need to be addressed in order to better support the authors' views.

      (1) WGA staining showed obvious cardiomyocyte hypertrophy in the AARS2 cKO heart. Whether AARS2 affects cardiac hypertrophy needs to be further tested.

      (2) The authors observed that AARS2 can improve outcomes after myocardial infarction. Does AARS2 also have an effect on other heart diseases?

      (3) Studies have shown that hypoxic conditions can lead to mitochondrial dysfunction, including abnormal division and fusion. AARS2 also affects mitochondrial division and fusion and interacts with mitochondrial proteins, including FIS and DRP1; the authors are encouraged to verify this.

      (4) The authors only examined the role of AARS2 in cardiomyocytes, and fibroblasts are also an important cell type in the heart. Authors should examine the expression and function of AARS2 in fibroblasts.

      (5) Overexpression of AARS2 can inhibit the production of mtROS and has a protective effect against myocardial ischemia and H/R-induced injury. The occurrence of ferroptosis is also closely related to ROS. Does AARS2 protect the myocardium by regulating ferroptosis?

      (6) Please revise the English grammar and writing style of the manuscript; spelling and grammatical errors should be corrected.

      (7) Recent studies have shown that a decrease in oxygen levels leads to an increase in AARS2, while lactate rises rapidly without being oxidized. Both of these factors inhibit oxidative phosphorylation and muscle ATP production by increasing mitochondrial protein lactylation, thereby limiting exercise capacity and preventing the accumulation of reactive oxygen species (ROS). These findings point to a key role of protein lactylation in regulating mitochondrial oxidative phosphorylation, and to the importance of metabolites such as lactate in regulating cell function through feedback mechanisms; that is, cells adapt to low oxygen through metabolic regulation to reduce ROS production and oxidative damage. Does AARS2 in the heart also act in this way?

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      Little is known about the local circuit mechanisms in the preoptic area (POA) that regulate body temperature. This carefully executed study investigates the role of GABAergic interneurons in the POA that express neurotensin (NTS). The principal finding is that GABA-release from these cells inhibits neighboring neurons, including warm-activated PACAP neurons, thereby promoting hyperthermia, whereas NTS released from these cells has the opposite effect, causing a delayed activation and hypothermia. This is shown through an elegant series of experiments that include slice recordings alongside matched in vivo functional manipulations. The roles of the two neurotransmitters are distinguished using a cell-type-specific knockout of Vgat as well as pharmacology to block GABA and NTS receptors. Overall, this is an excellent study that is noteworthy for revealing local circuit mechanisms in the POA that control body temperature and also for highlighting how amino acid neurotransmitters and neuropeptides released from the same cell can have opposing physiologic effects. I have only minor suggestions for revision.

      Reviewer #2 (Public Review):

      Summary:

      The study has demonstrated how two neurotransmitters and neuromodulators from the same neurons can be regulated and utilized in thermoregulation.

      The study utilized electrophysiological methods to examine the characteristics and thermoregulation of Neurotensin (Nts)-expressing neurons in the medial preoptic area (MPO). It was discovered that GABA and Nts may be co-released by neurons in MPO when communicating with their target neurons.

      Strengths:

      The study has leveraged optogenetic, chemogenetic, knockout, and pharmacological inhibitors to investigate the release process of Nts and GABA in controlling body temperature.

      The findings are relevant to those interested in the various functions of specific neuron populations and their distinct regulatory mechanisms on neurotransmitter/neuromodulator activities.

      Weaknesses:

      Key points for consideration include:

      (1) The co-release of GABA and Nts is primarily inferred rather than directly proven. Providing more direct evidence for the release of GABA and the co-release of GABA and Nts would strengthen the argument. Further in vitro analysis could strengthen the conclusion regarding this co-releasing process.

      Measurement of Nts concentrations in various brain regions during thermoregulatory responses is part of a future study.

      (2) The differences between optogenetic and chemogenetic methods were not thoroughly investigated. A comparison of in vitro results and direct observation of release patterns could clarify the mechanisms of GABA release alone or in conjunction with Nts under different stimulation techniques.

      A comparison of chemogenetic and optogenetic stimulation methods is not within the scope of this study.

      (3) Neuronal transcripts were mainly identified through PCR, and alternative methods like single-cell sequencing could be explored.

      Single cell transcriptomics of preoptic neurotensinergic neurons will be part of a different study.

      (4) In Figure 6, the impact of GABA released from Nts neurons in MPO on CBT regulation appears to vary with ambient temperatures, requiring a more detailed explanation for better comprehension.

      The different possible roles of GABA in different thermoregulatory circumstances are discussed on lines 555-581.

      (5) The model should emphasize the key findings of the study.

      The model is presented in Fig 8.

      Reviewer #3 (Public Review):

      Summary:

      Understanding the central neural circuits regulating body temperature is critical for improving health outcomes in many disease conditions and in combating heat stress in an ever-warming environment. The authors present important and detailed new data that characterizes a specific population of POA neurons with a relationship to thermoregulation. The new insights provided in this manuscript are exactly what is needed to assemble a neural network model of the central thermoregulatory circuitry that will contribute significantly to our understanding of regulating the critical homeostatic variable of body temperature. These experiments were conducted with the expertise of an investigator with career-long experience in intracellular recordings from POA neurons. They were interpreted conservatively in the appropriate context of current literature.

      The Introduction begins with "Homeotherms, including mammals, maintain core body temperature (CBT) within a narrow range", but this ignores the frequent hypothermic episodes of torpor that mice undergo triggered by cold exposure. Although the author does mention torpor briefly in the Discussion, since these experiments were carried out exclusively in mice, greater consideration (albeit speculative) of the potential for a role of MPO Nts neurons in torpor initiation or recovery is warranted. This is especially the case since some 'torpor neurons' have been characterized as PACAP-expressing and a population of PACAP neurons represent the target of MPO Nts neurons.

      Additional discussion of a possible role of neurotensinergic neurons in the initiation or recovery from torpor is included (lines 593-597).

    1. eLife assessment

      This study provides compelling data that defines the structure of the S. cerevisiae APC/C. The structure reveals overall conservation of its mechanism of action compared to the human APC/C but some important differences that indicate that activation by co-activator binding and phosphorylation are not identical to the human APC/C. Thus this study will be of considerable value to the field, although the conclusions regarding the effect of phosphorylation would be strengthened by quantification of the phosphopeptides. Recent work on the role of APC7 in APC/C activity in neurones should also be discussed with respect to the mode of action of the APC/C in human versus budding yeast cells.