  1. Aug 2025
    1. Reviewer #2 (Public review):

      Summary:

      The authors investigate the dependence of phage adsorption rates on host metabolic state, using five coliphages that differ in their infection cycles and host receptors. They find that four of the five phages showed significantly reduced infection under low metabolic states, with phages that generally have weaker adsorption being more strongly affected by low metabolism. The authors complement their findings with a 2-step infection model where phages can disengage from their hosts after initial adsorption. The paper illustrates the power of standardized experimental protocols for quantitative trait comparisons and highlights the dependence of phage infection success on host physiology.

      Strengths:

      The paper is well written and clearly structured.

      The experiments are well-designed, and particularly commendable is the diligent use of control scenarios to allow for quantitative comparison between phages. This standardized protocol will be valuable for the entire phage community.

      The authors convincingly show the impact of host physiology on phage adsorption success. This dependence has so far mainly been considered for intracellular phage replication, and the paper shows that host physiology has to be taken into account at all steps of phage infection.

      Weaknesses:

      There are some concerns about the experimental setup and which conclusions can be drawn from it:

      Before phage infection, bacterial cultures are grown to exponential phase, washed, and then resuspended with glucose or arsenate-azide for 10 min. It is, however, questionable whether 10 minutes is enough to simulate high and low metabolic states realistically. Ten minutes seems quite short to go from exponential growth to a low metabolic state, given the transcriptional memory of previous environments. It seems more likely that the population will be quite heterogeneous, with cells in various states of transition towards low metabolic states.

      Given that arsenate and azide inhibit cellular metabolism, i.e., have antimicrobial effects, cells might not just downregulate metabolism but also activate the stress response, which could cause some of the observed effects on phage adsorption. Therefore, the 'low metabolic state' of the cells in this paper could mean that cells are starved or that they are stressed or both.

      The abundance of receptors could change between the high and low metabolic media conditions and contribute to the observed differences in adsorption, while the authors seem to assume in their model that the initial adsorption rate always remains the same.

    2. Reviewer #3 (Public review):

      Summary:

      Marantos et al. showed that for some coliphages, the energetic state of the bacterial host cell has a strong impact on whether phage infection is initiated. The authors drew this conclusion from the observation that there are more free phages remaining in the medium after infection of arsenate-azide-treated cells as compared to after infection of untreated cells. These data were analyzed and reported both as ratios of the treated vs. untreated conditions and using a mass-action kinetic model of phage-cell collision in the infection mixture. The data supported the findings that for four phages infecting Escherichia coli bacteria, namely, phages λ, 𝜙80, M13, and T6, the phages are less likely to initiate infection if the host bacteria are energy-depleted. However, for phage T5, the authors found that its infection propensity is not impacted.

      Strengths:

      The data presented by the authors clearly supported the principal conclusion of the study ("Viral commitment to infection depends on host metabolism"). The five phages chosen by the authors represent different viral lifestyles and infection mechanisms, highlighting the potential applicability to other Escherichia coli phages. Finally, the authors successfully used a classic mass-action model of phage-cell collision to interpret their data. The simplicity of their experimental assay, combined with the use of this mathematical model, offers other investigators who study phage-bacterial interactions in other contexts a potentially useful toolkit to examine infection in general, and specifically, the dependence of phage infection on the host's metabolic state.

      Weaknesses:

      (1) The authors isolated and measured the numbers of free phages in the medium after infection of bacteria under different treatments. These measurements were analyzed in two different ways: (1) simply as ratios (corrected/normalized using different controls), and (2) fitted using a simple mathematical model. I have concerns regarding both analyses.

      1.1) For the first method, having different time points at which the sample of each phage is collected critically complicates data interpretation. As one incubates the phage-bacteria mixture for a longer time, more infection occurs, and the number of phages collected from the mixture decreases. Therefore, the different incubation times defeat the goal of "a systematic and quantitative comparison across different phages [...]" (line 81), just as the authors self-criticized. Conceivably, the authors could have used the shortest measurement time for all phages (i.e., 10 minutes, as for phage λ). Alternatively, the authors could have applied a systematic criterion such as half (or any other fraction) of the latent period of each phage, which would still "maximize the incubation period while ensuring that manipulations were completed before the first infection cycle concluded" (lines 126-127). In my view, the seemingly arbitrary measurement time for each phage renders the entire first analysis very challenging to interpret. It also goes against the authors' proposition that the protocol was "standardized" (line 92) or "consistent" (line 200). It is not clear what the readers are supposed to take away from this first analysis, or rather, which evidence, finding, or conclusion the manuscript would lose if the authors only presented the modeling-based analysis.

      1.2) The second method of analysis sought to remove the dependence of the measurements on time. I completely agree with this goal, and the findings extracted from this analysis significantly contributed to the merits of this manuscript. However, the authors achieved this goal using a single time point for each phage to calculate the infection rate (η). As shown in Figure S3, each of the phage depletion curves is anchored by only one data point (note that the P(t)/P(0) = 1 at t = 0 is assumed, not measured). This goes against the typical way this collision model is used in the literature, where a time series is measured and used to fit the model (e.g., DOI 10.1007/978-1-60327-164-6_18, or more recently, PMID 39700139). This practice in the current manuscript reduced the robustness of the inferred η values. This problem is exacerbated by assumptions used by the authors in formulating this model. For instance, the authors used a constant value for the bacterial concentration, B, because "bacterial growth and lysis were negligible" (lines 135-136). However, considering that the bacteria were cultured at 37 °C in a very rich medium (first in YT broth, then in 2% glucose), the measurement times of 20, 30, and 55 minutes are most likely one or a few generations of bacterial growth and division.
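
      The single-point concern can be made concrete with a small sketch. It assumes the collision model has the common form P(t)/P(0) = exp(-η·B·t) with constant B; that form, the function names, and all numbers below are illustrative assumptions, not the authors' code or data. The example adds a constant 20% loss in phage recovery (hypothetical) to show how an unmeasured starting point affects the two estimates.

```python
import math

def eta_single_point(frac_free, bacteria_per_ml, minutes):
    """Infer eta from one value of P(t)/P(0), taking P(0)/P(0) = 1 as given."""
    return -math.log(frac_free) / (bacteria_per_ml * minutes)

def eta_time_series(times, fracs_free, bacteria_per_ml):
    """Least-squares slope of ln(P(t)/P(0)) against t, with a free intercept."""
    n = len(times)
    logs = [math.log(f) for f in fracs_free]
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    sxx = sum((t - t_mean) ** 2 for t in times)
    return -(sxy / sxx) / bacteria_per_ml

# Hypothetical values, for illustration only.
B = 1e8            # bacteria per mL, assumed constant
ETA_TRUE = 1e-10   # mL per minute
times = [5, 10, 15, 20]
recovery = 0.8     # assumed constant bias: only 80% of free phages are recovered
fracs = [recovery * math.exp(-ETA_TRUE * B * t) for t in times]

eta_one = eta_single_point(fracs[1], B, times[1])  # biased by the recovery loss
eta_fit = eta_time_series(times, fracs, B)         # intercept absorbs the bias
```

      In this toy setting the time-series slope recovers the assumed η because the constant bias ends up in the intercept, whereas the single-point estimate attributes the recovery loss to infection and overestimates η roughly threefold.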

      Related note: I suggest that one of the panels in Figure S3 should be moved to the main text, since it is critical to the second method of analysis.

      (2) The data were able to distinguish phages that successfully infected bacteria and those that remained free in the medium, and the authors appropriately interpreted the data as such throughout the Results section. However, in the Discussion (starting from the very first sentence, line 172), the authors used terms that include "adsorption" and "entry" more interchangeably (for example, see the three sentences in lines 310-313, for "viral entry efficiency is shaped by [...]", then "adsorption kinetics modeling"). I do not see how the authors' data could distinguish between adsorption (the phage particles attaching to the outside of the cell) and entry (the phage DNA being injected into the cell). Conceivably, any phage particles that irreversibly attach to a cell but do not yet inject their genome into the cell would still be removed from the medium and therefore not quantified. Another example: in lines 189-191, the authors interpreted that "[...] when the bacterium is in a low metabolic state, the phage does not bind irreversibly to the host", but how do the authors eliminate the case of no phage binding (i.e., the reversible step) to begin with? Similarly, in lines 283-293, how do the authors delineate whether energy depletion would increase the k_off term or decrease the k_inj term, because either would result in more free phages in the medium as observed in the data? I believe that the writing of the Discussion, as it stands now, is doing a disservice to the conclusions presented in the Results section.
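
      The identifiability question in point (2) can be illustrated with a minimal sketch. It assumes a two-step scheme (reversible binding with rates k_on and k_off, followed by irreversible commitment with rate k_inj) and a quasi-steady-state approximation for the bound intermediate, which gives an effective rate k_on·k_inj/(k_off + k_inj). The symbol k_on, the approximation, and every number below are assumptions made for illustration and are not taken from the manuscript.

```python
def effective_rate(k_on, k_off, k_inj):
    """Effective irreversible adsorption rate under a quasi-steady-state assumption."""
    return k_on * k_inj / (k_off + k_inj)

# Hypothetical baseline for an energized host (arbitrary units).
k_on, k_off, k_inj = 1.0, 1.0, 1.0
baseline = effective_rate(k_on, k_off, k_inj)

# Scenario A: energy depletion raises k_off ninefold.
scenario_a = effective_rate(k_on, 9.0 * k_off, k_inj)

# Scenario B: energy depletion lowers k_inj ninefold instead.
scenario_b = effective_rate(k_on, k_off, k_inj / 9.0)
```

      Under this approximation only the ratio k_off/k_inj enters the effective rate, so both scenarios give the same fivefold reduction relative to baseline. A measurement of free phages alone would therefore not separate them.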

      (3) The authors presented an argument that performing infection of all five phages in the same condition is an advantage, allowing for comparison across different phages. While this goal is a completely valid one, it is difficult to reconcile that with the fact that different phages require different optimal conditions for successful infection. For instance, phage T5 famously requires Ca²⁺ for successful infection into the host bacterium (and later successful replication); see PMID 13174489. However, all infections were performed in TMG, which lacks Ca²⁺. Perhaps the absence of T5 dependence on the host metabolism is because the infection condition used by the authors was not optimal for T5 to begin with? Similar arguments could be made for other phages.

      (4) Whereas the manuscript examined five coliphages, only phage T5 and phage λ were discussed extensively. I believe some discussion points for these two phages need clarification.

      4.1) Phage T5: The data obtained by the authors show that the infection rate of phage T5 is not impacted by the metabolic state of the host cell. Considering that the authors used the terms "infection", "adsorption", and "entry" interchangeably to refer to the irreversible commitment of a phage to a host cell (see point 2), this discussion regarding phage T5 lacks one critical literature context: DNA entry of phage T5 is known to occur in two phases (first-step transfer and second-step transfer). Critically, the second step can only occur if phage proteins encoded by the phage DNA transferred in the first step are expressed (see PMID 10577483 and the cited papers therein). In that context, metabolic poisoning of the host bacteria should have impeded T5 infection. The authors should comment on this point.

      4.2) Phage λ: The experiment using phage λ in this current study shares many resemblances to that in Brown et al. 2022. That feature alone is not a problem, but at many places in the text, the writing is ambiguous as to whether it is discussing the results in Brown et al. 2022 or in the current manuscript. I am giving three examples below, but this is not exhaustive: (i) Lines 67-69, there is no Brown et al. 2022 reference immediately after "a mutant phage variant (λh) could bypass this dependency [...]" (not just in the previous sentence); (ii) Line 228 should clearly say "Our previous findings suggested that phage λ is capable of [...]", since it concerns Brown et al., 2022, not the current study; and (iii) Lines 245-246, there is no Brown et al., 2022 reference immediately after "we observed that a mutant variant [...] even energy-depleted host" (without a reference, it reads like the authors "observed" that finding in this current manuscript).

      Also, regarding phage λ: The discussion between line 230 and line 249 is very interesting, but since it concerns the differences between λ PaPa and Ur-λ, the authors should consider mentioning and discussing a very relevant recent study, PMCID: PMC6312755.

      (5) Control experiments, or references to prior studies, are needed to support that the As/Az treatment at this concentration and duration (at least 10 minutes) is sufficient to deplete the metabolic state of the cell. For instance, this can be shown by impeded or null cell growth, arrested motility (using a standard swimming assay), or a fluorescent reporter for the energetic state of the cell.

    1. eLife Assessment

      Zandvoort and colleagues describe respiration-brain coupling in the context of apnoea in human newborns. The authors have addressed an important question and supported their claims with solid data. The rigor of the findings could perhaps be further strengthened with some relatively minor changes to the analysis methodology.

    2. Reviewer #1 (Public review):

      Summary:

      The authors investigated the extent to which phase-amplitude coupling (PAC) of respiratory and electrophysiological brain activity recordings was related to episodes of life-threatening apnoea in human newborns.

      Strengths:

      I want to commend the authors for acquiring unique and illuminating data; the difficulty in recording and handling these data has to be appreciated. As far as I can tell, Zandvoort and colleagues are the first to provide robust evidence for respiration-brain coupling in newborns. Their creative use of the phase-slope index for peripheral-central interactions is innovative and credible. If proven to be robust, the authors' findings have important implications well beyond the field of brain-body research.

      Weaknesses:

      While the analyses were overall competently conducted and well-justified, I was not entirely convinced by a few methodological choices, specifically i) the computation of PAC surrogates, ii) details of the linear mixed-effects model, and iii) the electrode selection for linking phase-amplitude coupling to apnoea frequency.

    3. Reviewer #2 (Public review):

      Summary:

      The authors' central hypothesis was that the strength of cortico-respiratory coupling in infants is negatively associated with apnoea rate. To prove this, they first investigated the existence of cortico-respiratory coupling in premature and term-born infants, the spatial localisation of the cortical activity and its relationship with the phase of the respiratory cycle, and the directionality of coupling.

      Strengths:

      The researchers used synchronised EEG and impedance pneumography to detect the phase amplitude coupling.

      They have studied a wide range of gestations, from 28 weeks to 42 weeks, including males and females. Their exclusion criteria ensured that healthy babies were studied and potential confounders of impaired respiratory activity were avoided. Their sequential approach in addressing the objectives was appropriate.

      Weaknesses:

      As a neonatal clinician and neuroscientist, I have commented based on my expertise. I have not commented on signal processing.

      I did not identify any major weaknesses in the study. Some minor weaknesses include:

      (1) Data relating to the cortical oscillations and the respiratory phase is given. However, whether this would lead to their hypothesis that the strength of cortico-respiratory coupling is negatively associated with apnoea rate is unclear. What preceding data enabled the authors to link the strength of coupling to the rate of apnoea?

      (2) If we did not know of data showing the existence of cortico-respiratory coupling in newborn infants, then should it not be the first research question to examine?

      (3) What are the characteristics of the infants who contributed data to establish the cortico-respiratory coupling (Figures 2 and 3)?

      (4) Although it is the most plausible direction of the relationship, with neural activation driving respiratory muscle contraction, how can the authors prove this with their data? Given that they show coherence between signals, how do we know that the cortical signal precedes the respiratory muscle contraction?

      (5) Apgar score is an ordinal variable. The authors should summarise this as median (range).
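
      The summary requested in point (5) could be produced as in the short sketch below. The scores are hypothetical and serve only to show the median (range) format.

```python
import statistics

def median_and_range(scores):
    """Summarise an ordinal score as (median, minimum, maximum)."""
    return statistics.median(scores), min(scores), max(scores)

# Hypothetical 5-minute Apgar scores, for illustration only.
apgar_5min = [7, 8, 9, 9, 10, 9, 8]
med, lowest, highest = median_and_range(apgar_5min)
summary = f"{med} ({lowest}-{highest})"
```

      Unlike a mean and standard deviation, this summary does not assume equal spacing between score categories.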

    4. Reviewer #3 (Public review):

      Summary:

      This is a strong and important report that presents a framework for understanding cortical contributions to neonatal respiration. Overall, the authors successfully achieved their goal of linking cortical activity to respiratory drive. Despite the correlational nature of this study, it is a crucial step in establishing a foundation for future work to elucidate the interaction between cortical activity and breathing.

      Strengths:

      (1) The introduction and use of workflows that establish correlational relationships between breathing and brain activity.

      (2) The execution of these workflows in human neonates.

      Weaknesses:

      Interpretations related to causal inference, confounds of sleep and caffeine, and the spatial interpretation of EEG data need to be addressed to ensure that the data appropriately support the conclusions.

    5. Author response:

      We would like to thank the reviewers for their helpful comments and critique of our manuscript. We plan to make the following revisions, which will improve the clarity of our manuscript and the robustness of our findings.

      We will revise methodological details and interpretation throughout the manuscript. In particular, we will consider alternative methods for calculating surrogates. We intend to investigate the relationship between apnoea rate and phase-amplitude coupling at other electrodes as suggested by Reviewer 1, and we will revise the details of the linear-mixed effects models.

      In relation to the comments raised by both Reviewers 2 and 3, we will carefully address the wording throughout the manuscript, including addressing the order of hypotheses, our interpretation of the directionality of the relationship between cortical and respiratory activity, and the connection between cortical-respiratory coupling and apnoea. We will further clarify the limitations of our recording setup and approach, in particular the limited EEG montage, and add further details with regards to sleep state and caffeine.

    1. eLife Assessment

      This study presents valuable and compelling evidence that β-glucan-induced trained immunity can protect against intestinal inflammation by reprogramming innate immune cells toward a reparative phenotype. The authors employ a convincing combination of functional assays, adoptive transfers, and single-cell transcriptomics to uncover mechanistic insights and demonstrate the therapeutic potential of innate immune memory in IBD. While the work is robust, addressing the underlying epigenetic mechanisms and including additional controls would further reinforce the trained immunity-specific interpretation.

    2. Reviewer #1 (Public review):

      Summary:

      This study presents an interesting investigation into the role of trained immunity in inflammatory bowel disease, demonstrating that β-glucan-induced reprogramming of innate immune cells can ameliorate experimental colitis. The findings are novel and clinically relevant, with potential implications for therapeutic strategies in IBD. The combination of functional assays, adoptive transfer experiments, and single-cell RNA sequencing provides comprehensive mechanistic insights. However, some aspects of the study could benefit from further clarification to strengthen the conclusions.

      Strengths:

      (1) This study elegantly connects trained immunity with IBD, demonstrating how β-glucan-induced innate immune reprogramming can mitigate chronic inflammation.

      (2) Adoptive transfer experiments robustly confirm the protective role of monocytes/macrophages in colitis resolution.

      (3) Single-cell RNA sequencing provides mechanistic depth, revealing the expansion of reparative Cx3cr1⁺ macrophages and their contribution to epithelial repair.

      (4) The work highlights the therapeutic potential of trained immunity in restoring gut homeostasis, offering new directions for IBD treatment.

      Weaknesses:

      While β-glucan may exert its training effect on hematopoietic stem cells, performing ATAC-seq on HSCs or monocytes to profile chromatin accessibility at antibacterial defense and mucosal repair-related genes would further validate the trained immunity mechanism. Alternatively, the authors could acknowledge this as a study limitation and future research direction.

    3. Reviewer #2 (Public review):

      Summary:

      The study investigates whether β-glucan (BG) can reprogram the innate immune system to protect against intestinal inflammation. The authors show that mice pretreated with BG prior to DSS-induced colitis experience reduced colitis severity, including less weight loss and colon damage, improved gut repair, and lowered inflammation. These effects were independent of adaptive immunity and were linked to changes in monocyte function.

      The authors show that the BG-trained monocytes not only help control inflammation but confer non-specific protection against experimental infections (Salmonella), suggesting the involvement of trained immunity (TI) mechanisms. Using single-cell RNA sequencing, they map the transcriptional changes in these cells and show enhanced differentiation of monocytes into reparative CX3CR1⁺ macrophages. Importantly, these protective effects were transferable to other mice via adoptive cell transfer and bone marrow transplantation, suggesting that the innate immune system had been reprogrammed at the level of stem/progenitor cells.

      Overall, this study provides evidence that TI, often associated with heightened inflammatory programs, can also promote tissue repair and resolution of inflammation. Moreover, this BG-induced functional reprogramming can be further harnessed to treat chronic inflammatory disorders like IBD.

      Strengths:

      (1) The authors use advanced experimental approaches to explore the potential therapeutic use of myeloid reprogramming by β-glucan in IBD.

      (2) The authors follow a data-to-function approach, integrating bulk and single-cell RNA sequencing with in vivo functional validation to support their conclusions.

      (3) The study adds to the growing evidence that TI is not a singular pro-inflammatory program, but can adopt distinct functional states, including anti-inflammatory and reparative phenotypes, depending on the context.

      Weaknesses:

      (1) The epigenetic and metabolic basis of TI is not explored, which weakens the mechanistic claim of TI. This is especially relevant given that a novel reparative, anti-inflammatory TI program is proposed.

      (2) The absence of a BG-only group limits interpretation of the results. Since the authors report tissue-level effects such as enhanced mucosal repair and transcriptional shifts in intestinal macrophages (colonic RNA-Seq), it is important to rule out whether BG alone could influence the gut independently of DSS-induced inflammation. Without a BG-only control, it is hard to distinguish a true trained response from a potential modulation caused directly by BG.

      (3) Although monocyte transfer experiments show protection in colitis, the fate of the transferred cells is not described (e.g., homing or differentiation into Cx3cr1⁺ macrophage subsets). This weakens the link between specific monocyte subsets and the observed phenotype.

      (4) While scRNA-seq reveals distinct monocyte/macrophage subclusters (Mono1-3, etc.), their specific functional roles remain speculative. The authors assign reparative or antimicrobial functions based on transcriptional signatures, but do not perform causal experiments (depletion or in vitro assays). The biological roles of these cells remain correlative.

      (5) While Rag1⁻/⁻ mice were used to rule out adaptive immunity, the potential role of innate lymphoid cells (ILCs), particularly ILC2s and ILC3s, which are known to promote mucosal repair (PMID: 27484190), was not explored. Given the reparative phenotype observed, the contribution of ILCs remains a confounding factor.

    4. Reviewer #3 (Public review):

      Summary:

      In the present work, Yinyin Lv et al offer evidence for the therapeutic potential of trained immunity in the context of inflammatory bowel disease (IBD). Prior research has demonstrated that innate cells pre-treated (trained) with β-glucan show an enhanced pro-inflammatory response upon a second challenge.

      While an increased immune response can be beneficial and protect against bacterial infections, there is also the risk that it will worsen symptoms in various inflammatory disorders. In the present study, the authors show that mice preconditioned with β-glucan have enhanced resistance to Staphylococcus aureus infection, indicating heightened immune responses.

      The authors demonstrate that β-glucan training of bone marrow hematopoietic progenitors and peripheral monocytes mitigates the pro-inflammatory effects of colitis, with protection extending to naïve recipients of the trained cells.

      Using a dextran sulfate sodium (DSS)-induced model of colitis, β-glucan pre-treatment significantly dampens disease severity. Importantly, the use of Rag1⁻/⁻ mice, which lack adaptive immune cells, confirms that the protective effects of β-glucan are mediated by innate immune mechanisms. Further, experiments using Ccr2⁻/⁻ mice underline the necessity of monocyte recruitment in mediating this protection, highlighting CCR2 as a key factor in the mobilization of β-glucan-trained monocytes to inflamed tissues. Transcriptomic profiling reveals that β-glucan training upregulates genes associated with pattern recognition, antimicrobial defense, immunomodulation, and interferon signaling pathways, suggesting broad functional reprogramming of the innate immune compartment. In addition, β-glucan training induces a distinct monocyte subpopulation with enhanced activation and phagocytic capacity. These monocytes exhibit an increased ability to infiltrate inflamed colonic tissue and differentiate into macrophages, marked by increased expression of Cx3cr1. Moreover, among these trained monocyte and macrophage subsets, other gene expression signatures are associated with tissue and mucosal repair, suggesting a role in promoting resolution and regeneration following inflammatory insult.

      Strengths:

      (1) Overall, the authors present a mechanistically insightful investigation that advances our understanding of trained immunity in IBD.

      (2) By employing a range of well-characterized murine models, the authors investigate specific mechanisms involved in the effects of β-glucan training.

      (3) Furthermore, the study provides functional evidence that the protection conferred by the trained cells persists within the hematopoietic progenitors and can be transferred to naïve recipients. The integration of transcriptomic profiling allows the identification of changes in key genes and molecular pathways underlying the trained immune phenotype.

      (4) This is an important study that demonstrates that β-glucan-trained innate cells confer protection against colitis and promote mucosal repair, and these findings underscore the potential of harnessing innate immune memory as a therapeutic approach for chronic inflammatory diseases.

      Weaknesses:

      However, FPKM is not ideal for between-sample comparisons due to its within-sample normalization approach. Best practices recommend using raw counts (with DESeq2) for more robust statistical inference.
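
      The concern about within-sample normalization can be illustrated with a toy calculation. It uses the standard definition FPKM = reads × 10⁹ / (gene length in bp × total mapped reads). The gene names, abundances, and sequencing depth are hypothetical, and the read model (expected reads proportional to abundance × length) is a simplifying assumption.

```python
def reads_at_depth(abundance, lengths_bp, depth):
    """Expected reads per gene: proportional to abundance * length, scaled to a fixed depth."""
    weights = {g: abundance[g] * lengths_bp[g] for g in abundance}
    total = sum(weights.values())
    return {g: depth * w / total for g, w in weights.items()}

def fpkm(reads, lengths_bp):
    """FPKM = reads * 1e9 / (gene length in bp * total mapped reads in the sample)."""
    total = sum(reads.values())
    return {g: r * 1e9 / (lengths_bp[g] * total) for g, r in reads.items()}

# Hypothetical three-gene transcriptome, for illustration only.
lengths = {"geneA": 1000, "geneB": 1000, "geneC": 1000}
abundance_1 = {"geneA": 1.0, "geneB": 1.0, "geneC": 1.0}
abundance_2 = {"geneA": 1.0, "geneB": 1.0, "geneC": 4.0}  # only geneC changes

depth = 1_000_000  # same sequencing depth assumed for both samples
fpkm_1 = fpkm(reads_at_depth(abundance_1, lengths, depth), lengths)
fpkm_2 = fpkm(reads_at_depth(abundance_2, lengths, depth), lengths)

apparent_fold_drop = fpkm_1["geneA"] / fpkm_2["geneA"]
```

      In this toy case geneA has the same abundance in both samples, yet its FPKM halves in the second sample because geneC takes a larger share of the reads. Count-based methods, which estimate sample-level scaling factors from the raw counts, are designed to reduce this kind of composition effect.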

    5. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This study presents an interesting investigation into the role of trained immunity in inflammatory bowel disease, demonstrating that β-glucan-induced reprogramming of innate immune cells can ameliorate experimental colitis. The findings are novel and clinically relevant, with potential implications for therapeutic strategies in IBD. The combination of functional assays, adoptive transfer experiments, and single-cell RNA sequencing provides comprehensive mechanistic insights. However, some aspects of the study could benefit from further clarification to strengthen the conclusions.

      We are grateful for the reviewer’s positive assessment of our study and constructive suggestions to improve the manuscript.

      Strengths:

      (1) This study elegantly connects trained immunity with IBD, demonstrating how β-glucan-induced innate immune reprogramming can mitigate chronic inflammation.

      (2) Adoptive transfer experiments robustly confirm the protective role of monocytes/macrophages in colitis resolution.

      (3) Single-cell RNA sequencing provides mechanistic depth, revealing the expansion of reparative Cx3cr1⁺ macrophages and their contribution to epithelial repair.

      (4) The work highlights the therapeutic potential of trained immunity in restoring gut homeostasis, offering new directions for IBD treatment.

      Weaknesses:

      While β-glucan may exert its training effect on hematopoietic stem cells, performing ATAC-seq on HSCs or monocytes to profile chromatin accessibility at antibacterial defense and mucosal repair-related genes would further validate the trained immunity mechanism. Alternatively, the authors could acknowledge this as a study limitation and future research direction.

      We agree that further epigenetic profiling—such as ATAC-seq analysis on HSCs or monocytes—would provide additional mechanistic depth to our current findings. We will acknowledge this as a limitation of the present study and highlight it as an important direction for future research.

      Comment (1): It’s better to include a schematic summarizing the proposed mechanism for reader clarity.

      We agree that a visual summary will enhance the clarity and accessibility of our findings. We will add a new schematic diagram (Figure 6) illustrating the proposed mechanism of β-glucan–induced myeloid reprogramming and its protective effects in the experimental colitis model.

      Comment (2): Discuss potential off-target effects of β-glucan-induced trained immunity (e.g., risk of exacerbated inflammation in other contexts).

      We appreciate this important comment regarding the potential off-target effects of β-glucan pretreatment. As trained immunity is known to amplify inflammatory responses upon heterologous stimulation and has been implicated in chronic inflammation–prone conditions such as atherosclerosis, this is an important consideration. Previous in vivo studies have shown that β-glucan pretreatment can enhance antibacterial or antitumor responses without inducing basal inflammation after one week of administration (PMID: 22901542, PMID: 30380404, PMID: 36604547, PMID: 33125892). Nevertheless, it remains possible that β-glucan–induced trained immunity could have unintended effects in certain contexts, which warrants further investigation and caution. We will expand the Discussion section to include a dedicated paragraph addressing these potential off-target effects.

      Reviewer #2 (Public review):

      Summary:

      The study investigates whether β-glucan (BG) can reprogram the innate immune system to protect against intestinal inflammation. The authors show that mice pretreated with BG prior to DSS-induced colitis experience reduced colitis severity, including less weight loss and colon damage, improved gut repair, and lowered inflammation. These effects were independent of adaptive immunity and were linked to changes in monocyte function.

      The authors show that the BG-trained monocytes not only help control inflammation but confer non-specific protection against experimental infections (Salmonella), suggesting the involvement of trained immunity (TI) mechanisms. Using single-cell RNA sequencing, they map the transcriptional changes in these cells and show enhanced differentiation of monocytes into reparative CX3CR1⁺ macrophages. Importantly, these protective effects were transferable to other mice via adoptive cell transfer and bone marrow transplantation, suggesting that the innate immune system had been reprogrammed at the level of stem/progenitor cells.

      Overall, this study provides evidence that TI, often associated with heightened inflammatory programs, can also promote tissue repair and resolution of inflammation. Moreover, this BG-induced functional reprogramming can be further harnessed to treat chronic inflammatory disorders like IBD.

      Strengths:

      (1) The authors use advanced experimental approaches to explore the potential therapeutic use of myeloid reprogramming by β-glucan in IBD.

      (2) The authors follow a data-to-function approach, integrating bulk and single-cell RNA sequencing with in vivo functional validation to support their conclusions.

      (3) The study adds to the growing evidence that TI is not a singular pro-inflammatory program, but can adopt distinct functional states, including anti-inflammatory and reparative phenotypes, depending on the context.

      We are grateful for the reviewer’s positive assessment of our study and recognition of its translational implications. We particularly appreciate the acknowledgment that our work expands the therapeutic potential of β-glucan–mediated trained immunity in ameliorating colitis.

      Weaknesses:

      (1) The epigenetic and metabolic basis of TI is not explored, which weakens the mechanistic claim of TI. This is especially relevant given that a novel reparative, anti-inflammatory TI program is proposed.

      We appreciate the reviewer’s valuable comment highlighting the importance of the epigenetic and metabolic basis of TI in providing mechanistic insight. While previous studies, including work from our group (S.-C. Cheng), have extensively characterized the epigenetic and metabolic signatures of monocytes from BG-trained mice—primarily in the context of inflammatory genes—we acknowledge that these aspects are not directly addressed in our current manuscript.

      To strengthen the mechanistic component, we plan to: 1. Reanalyze relevant public datasets, focusing on pathways related to reparative and antibacterial function. 2. Perform monocyte ATAC-seq in our current model to validate the epigenetic changes in these pathways.

      (2) The absence of a BG-only group limits interpretation of the results. Since the authors report tissue-level effects such as enhanced mucosal repair and transcriptional shifts in intestinal macrophages (colonic RNA-Seq), it is important to rule out whether BG alone could influence the gut independently of DSS-induced inflammation.

      Without a BG-only control, it is hard to distinguish a true trained response from a potential modulation caused directly by BG.

      We thank the reviewer for this important suggestion. Although we did not perform qPCR for mucosal repair genes in Figure S1C and Figure S1D, our colon RNA-seq analysis in Figure 5G included a BG-only control group (Colitis_d0). The results from this group indicate that BG preconditioning alone does not alter baseline expression of colon mucosal repair genes, supporting the conclusion that the observed effects occur in the context of DSS-induced inflammation.

      (3) Although monocyte transfer experiments show protection in colitis, the fate of the transferred cells is not described (e.g., homing or differentiation into Cx3cr1⁺ macrophage subsets). This weakens the link between specific monocyte subsets and the observed phenotype.

      (4) While scRNA-seq reveals distinct monocyte/macrophage subclusters (Mono1-3), their specific functional roles remain speculative. The authors assign reparative or antimicrobial functions based on transcriptional signatures, but do not perform causal experiments (depletion or in vitro assays). The biological roles of these cells remain correlative.

      We agree that the functional role of CX3CR1<sup>+</sup> macrophages is not comprehensively validated and is currently inferred from scRNA-seq clustering. While our flow cytometry data show increased CX3CR1<sup>+</sup> macrophages in the BG-TI group, and our CCR2 KO and monocyte adoptive transfer experiments indicate these macrophages are monocyte-derived, we lack direct depletion experiments due to the unavailability of effective depletion antibodies for this subset.

      We acknowledge this as a limitation and will clarify in the Discussion that our conclusions regarding CX3CR1<sup>+</sup> macrophage function are based on transcriptional profiling and association with protective phenotypes, rather than direct causal evidence.

      (5) While Rag1<sup>-/-</sup> mice were used to rule out adaptive immunity, the potential role of innate lymphoid cells (ILCs), particularly ILC2s and ILC3s, which are known to promote mucosal repair (PMID: 27484190), was not explored. Given the reparative phenotype observed, the contribution of ILCs remains a confounding factor.

      We appreciate the reviewer’s valuable comment regarding the potential role of ILCs in the observed mucosal repair. Indeed, in examining the BG-trained immunity effect, the contribution of ILCs was not evaluated. We will explicitly acknowledge in the Discussion that Rag1⁻/⁻ mice retain ILCs (including ILC3s) and that BG-induced activation of these cells remains possible.

      The literature (PMID: 21502992; PMID: 32187516) supports a role for ILC3-mediated IL-22 production in tissue repair, which could overlap with our observed effects. However, our monocyte adoptive transfer experiments show that monocytes alone can alleviate DSS-induced colitis, suggesting a dominant role for monocytes in this context. Nonetheless, we will make it clear that ILC contributions cannot be excluded.

      Reviewer #3 (Public review):

      Summary:

      In the present work, Yinyin Lv et al. offer evidence for the therapeutic potential of trained immunity in the context of inflammatory bowel disease (IBD). Prior research has demonstrated that innate cells pre-treated (trained) with β-glucan show an enhanced pro-inflammatory response upon a second challenge.

      While an increased immune response can be beneficial and protect against bacterial infections, there is also the risk that it will worsen symptoms in various inflammatory disorders. In the present study, the authors show that mice preconditioned with β-glucan have enhanced resistance to Staphylococcus aureus infection, indicating heightened immune responses.

      The authors demonstrate that β-glucan training of bone marrow hematopoietic progenitors and peripheral monocytes mitigates the pro-inflammatory effects of colitis, with protection extending to naïve recipients of the trained cells.

      Using a dextran sulfate sodium (DSS)-induced model of colitis, β-glucan pre-treatment significantly dampens disease severity. Importantly, the use of Rag1<sup>-/-</sup> mice, which lack adaptive immune cells, confirms that the protective effects of β-glucan are mediated by innate immune mechanisms. Further, experiments using Ccr2<sup>-/-</sup> mice underline the necessity of monocyte recruitment in mediating this protection, highlighting CCR2 as a key factor in the mobilization of β-glucan-trained monocytes to inflamed tissues. Transcriptomic profiling reveals that β-glucan training upregulates genes associated with pattern recognition, antimicrobial defense, immunomodulation, and interferon signaling pathways, suggesting broad functional reprogramming of the innate immune compartment. In addition, β-glucan training induces a distinct monocyte subpopulation with enhanced activation and phagocytic capacity. These monocytes exhibit an increased ability to infiltrate inflamed colonic tissue and differentiate into macrophages, marked by increased expression of Cx3cr1. Moreover, among these trained monocyte and macrophage subsets, other gene expression signatures are associated with tissue and mucosal repair, suggesting a role in promoting resolution and regeneration following inflammatory insult.

      Strengths:

      (1) Overall, the authors present a mechanistically insightful investigation that advances our understanding of trained immunity in IBD.

      (2) By employing a range of well-characterized murine models, the authors investigate specific mechanisms involved in the effects of β-glucan training.

      (3) Furthermore, the study provides functional evidence that the protection conferred by the trained cells persists within the hematopoietic progenitors and can be transferred to naïve recipients. The integration of transcriptomic profiling allows the identification of changes in key genes and molecular pathways underlying the trained immune phenotype.

      (4) This is an important study that demonstrates that β-glucan-trained innate cells confer protection against colitis and promote mucosal repair, and these findings underscore the potential of harnessing innate immune memory as a therapeutic approach for chronic inflammatory diseases.

      We thank the reviewer for their positive evaluation and constructive feedback on our manuscript.

      Weaknesses:

      However, FPKM is not ideal for between-sample comparisons due to its within-sample normalization approach. Best practices recommend using raw counts (with DESeq2) for more robust statistical inference.

      We appreciate the reminder about best practices for RNA-seq analysis. We apologize for the inaccurate description in the Materials and Methods section. For all differential expression analyses, we have in fact used raw count data as input for DESeq2. FPKM values were only used for visualization purposes, such as in heatmaps and clustering analyses. We will correct this description in the revised manuscript to accurately reflect our analysis workflow.
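The within-sample nature of FPKM that the reviewer refers to can be illustrated with a minimal Python sketch of the standard FPKM formula. This is not the authors' pipeline: the gene names, counts, and lengths below are hypothetical and chosen only for illustration.

```python
def fpkm(counts, lengths_bp):
    """Fragments per kilobase of transcript per million mapped fragments."""
    total_fragments = sum(counts.values())
    return {
        gene: count * 1e9 / (lengths_bp[gene] * total_fragments)
        for gene, count in counts.items()
    }

# Hypothetical gene lengths and raw counts (not data from the study).
lengths_bp = {"geneA": 1000, "geneB": 2000, "geneC": 1000}
sample_1 = {"geneA": 100, "geneB": 400, "geneC": 500}
sample_2 = {"geneA": 100, "geneB": 400, "geneC": 4500}  # only geneC differs

fpkm_1 = fpkm(sample_1, lengths_bp)
fpkm_2 = fpkm(sample_2, lengths_bp)

# FPKM is a within-sample relative measure: geneA's value depends on the
# totals of every other gene in the same sample, so it shifts when geneC changes.
print(fpkm_1["geneA"], fpkm_2["geneA"])
```

Because each value is scaled by its own sample's total, count-based tools such as DESeq2 take raw counts as input and estimate normalization factors across samples themselves.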

    1. eLife Assessment

      The study by Takagi and colleagues is an important contribution to the question of how homologous neuronal circuits might be wired differently to elicit specific behaviours. The authors combine genetic, neuroanatomical, and behavioral data to provide convincing evidence that Dfz2/DWnt4 signaling controls the innervation pattern of wave command neurons in the fly larva, and thereby behavioral locomotion program selection.

    2. Reviewer #1 (Public review):

      Summary

      In this study Takagi and colleagues demonstrate that changes in axonal arborization of the segmental wave motor command neurons are sufficient to change behavioral motor output.

      The authors identify the Wnt receptors DFz2 and DFz4 and the ligand Wnt4 as modulators of the stereotypic segmental arborization pattern of segmental wave neurons along the anterior-posterior body axis. Based on both embryonic expression pattern analysis and genetic manipulation of the signaling components in wave neurons (receptors) and the neuropil (Wnt4), the authors convincingly demonstrate that Wnt4 acts as a repulsive ligand for DFz2 that restricts posterior axon guidance of both anterior and posterior wave neurons. They also provide first evidence that Wnt4 potentially acts as an attractive ligand for DFz4 to promote posterior extension of p-wave neurons. Interestingly, artificial optogenetic activation of all wave neurons, which normally induces backward locomotion due to the activity of anterior wave neurons, fails to induce backward locomotion in a DFz2 knockdown condition with altered axonal extensions of all wave neurons towards posterior segments. In addition, the authors now observe enhanced fast-forward locomotion, a feature normally induced by posterior wave neurons. Consistent with these findings, they observe that the natural response to an anterior tactile stimulus is similarly altered in DFz2 knockdown animals. The animals respond with less backward movement and increased fast-forward motion. These results suggest that alterations in the innervation pattern of wave motor command neurons are sufficient to switch behavioral response programs.

      Strengths

      The authors convincingly demonstrate the importance of Wnt signaling for anterior-posterior axon guidance of a single class of motor command neurons in the larval CNS. The demonstration that alteration of the expression level of a single axon guidance receptor is sufficient to not only alter the innervation pattern but to significantly modify the behavioral response program of the animal provides a potential entry point to understand behavioral adaptations during evolution.

      Weaknesses

      The authors demonstrate an alteration of the behavioral response to a natural tactile stimulus and correlate this to morphological alterations observed in the single-neuron analyses. As the authors suggest an alteration of the command circuitry, a direct observation of the downstream activation pattern in response to selective optogenetic stimulation of anterior wave neurons (if possible with appropriate genetic tools in the future) would further strengthen their claims.

    3. Reviewer #2 (Public review):

      Summary:

      In the manuscript, the authors aim to determine the molecular mechanisms involved in wiring the segmentally homologous a- and p-Wave neurons distinctively, such that they are functionally different in modulating forward or backward locomotion. The genetic screen focused on Wnt/Fz-signaling due to its known anterior-to-posterior guidance roles in mammals and nematodes.

      Strengths:

      The conclusion that the Frizzled receptors DFz2 and DFz4, as well as the DWnt4 ligand, are essential for normal segment-specific axon projections of Wave command neurons is strongly supported by the elaborate morphological analyses of numerous Wnt/Fz gain- and loss-of-function mutants. The distinctive Wnt/Fz ligand-receptor gradients also imply that they contribute to the diversification of Wave neurons in a location-dependent manner and that DFz2 and DFz4 may have opposing effects on axon extension.

      Labeling of the synaptic marker Bruchpilot in DFz2 mutants in this revised manuscript now supports that the ectopic projections in a-Wave neurons make synaptic connections. Finally, the altered responses in two behavioral assays (optogenetic stimulation of all Wave neurons or tactile stimuli on heads using a von Frey filament) further strongly support the main conclusion, that Wnt/Fz-signaling is essential for the guidance of both Wave neurons and in diversifying their projection pattern in a segment-specific manner.

      Weaknesses:

      There are no major weaknesses in the revised version of this work.

      Re-analysis of DFz2 expression now shows it is bidirectionally distributed. This new result does not affect the previous and current conclusions for the a-Wave neurons but leaves alternative interpretations for p-Wave neurons, which the authors have now included in their discussion. Evidently, it seems unlikely that the complex wiring of the numerous segmental a- and p-Wave neurons will be solely dependent on Wnt4-DFz2/4; it is likely to also involve other Wnt/Fz (see Figure 1-figure supplement 2) or distinct guidance signaling pathways. However, unraveling all factors involved is certainly beyond the scope of this study, and the main conclusions made by the authors are well supported by the data provided.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary

      In this study, Takagi and colleagues demonstrate that changes in axonal arborization of the segmental wave motor command neurons are sufficient to change behavioral motor output.

      The authors identify the Wnt receptors DFz2 and DFz4 and the ligand Wnt4 as modulators of stereotypic segmental arborization patterns of segmental wave neurons along the anterior-posterior body axis. Based on both embryonic expression pattern analysis and genetic manipulation of the signaling components in wave neurons (receptors) and the neuropil (Wnt4), the authors convincingly demonstrate that Wnt4 acts as a repulsive ligand for DFz2 that restricts posterior axon guidance of both anterior and posterior wave neurons. They also provide the first evidence that Wnt4 potentially acts as an attractive ligand for DFz4 to promote the posterior extension of p-wave neurons. Interestingly, artificial optogenetic activation of all wave neurons, which normally induces backward locomotion due to the activity of anterior wave neurons, fails to induce backward locomotion in a DFz2 knockdown condition with altered axonal extensions of all wave neurons towards posterior segments. In addition, the authors now observe enhanced fast-forward locomotion, a feature normally induced by posterior wave neurons. Consistent with these findings, they observe that the natural response to an anterior tactile stimulus is similarly altered in DFz2 knockdown animals. The animals respond with less backward movement and increased fast-forward motion. These results suggest that alterations in the innervation pattern of wave motor command neurons are sufficient to switch behavioral response programs.

      Strengths

      The authors convincingly demonstrate the importance of Wnt signaling for anterior-posterior axon guidance of a single class of motor command neurons in the larval CNS. The demonstration that alteration of the expression level of a single axon guidance receptor is sufficient to not only alter the innervation pattern but to significantly modify the behavioral response program of the animal provides a potential entry point to understanding behavioral adaptations during evolution.

      Weaknesses

      While the authors demonstrate an alteration of the behavioral response to a natural tactile stimulus, the observed effects, a reduction of backward motion and increased fast-forward locomotion, currently cannot be directly correlated to the morphological alterations observed in the single-neuron analyses. The authors do not report any loss of innervation in the "normal" target region but only a small additional innervation of more posterior regions. An analysis of synaptic connectivity and/or a more detailed morphological analysis that is supported by a larger number of analyzed neurons both in control and experimental animals would further strengthen the confidence of the study. As the authors suggest an alteration of the command circuitry, a direct observation of the downstream activation pattern in response to selective optogenetic stimulation of anterior wave neurons would further strengthen their claims (analogous to Takagi et al., 2017, Figure 4).

      We sincerely thank the reviewer for their insightful comments, which were instrumental in improving our manuscript. In response to the reviewers’ suggestion, we have now studied Brp expression and demonstrate that the ectopically extending Wave axons in the posterior region do contain synapses (new Figure 2). This finding supports the idea that these axons are functionally connected to ectopic downstream circuits. 

      Additionally, we have increased the number of analyzed Wave clones in Figure 1F-J (WT and DFz2 KD) and new Figure 3C-G (WT; formerly Figure 2C-G) to strengthen the morphological analyses. We fully agree with the reviewer that “direct observation of the downstream activation pattern in response to selective optogenetic stimulation” would further reinforce our conclusions. However, this was not feasible in the current study since we found that the Wave-Gal4 driver used in this study, which drives expression during embryonic stages, does not drive sufficiently strong expression in the larvae to enable selective optogenetic stimulation (please see below for details). 

      Reviewer #2 (Public Review):

      Summary:

      The authors previously demonstrated that anterior-located a-Wave neurons (neuromeres A1-A3) extend axons anteriorly to connect to circuits inducing backward locomotion, while p-Wave axons (neuromeres A4-A7) project posteriorly to promote forward locomotion in Drosophila larvae. In the manuscript, the authors aim to determine the molecular mechanisms involved in wiring the segmentally homologous Wave neurons distinctively, such that they are functionally different in modulating forward or backward locomotion. The genetic screen focused on Wnt/Fz-signaling due to its known anterior-to-posterior guidance roles in mammals and nematodes.

      Strengths:

      Knock-down (KD) DFz2 with two independent RNAi-lines caused ectopic posterior axon and dendrite extension for all a- and p-Wave neurons, with a-Wave axon extending into regions where p-Wave axons normally project. Both behavioral assays (optogenetic stimulation of all Wave neurons or tactile stimuli on heads using a von Frey filament) show that backward movement is reduced or absent and that the speed of evoked fast-forward locomotion is increased. This demonstrates that altered projections of Wave do alter behavior and the DFz2 KD phenotype is consistent with the potential aberrant wiring of a-Wave neurons to forward locomotion-promoting circuits instead of to backward locomotion-promoting circuits.

      The main conclusion, that Wnt/Fz-signaling is essential for the guidance of Wave neurons and in diversifying their projection pattern in a segment-specific manner, is further supported by the results showing that DFz2 gain of function causes shortening of a-Wave but not p-Wave axon extensions towards the posterior end and that KD of DFz4 causes axonal shortening only in A6-p-Wave neurons but does not affect dendrites or processes of other Wave neurons. A role for the ligand Wnt4 is demonstrated by results indicating that, in Wnt4 mutants, posterior extension of a-Wave axons was elongated, similar to DFz2 KD animals, and p-Wave axon extension towards the posterior end was shortened similar to DFz2 KD animals. Finally, a DWnt4 gradient decreasing from the posterior (A8) to the anterior end (A2), similar to that described in other species, is supported by analyses of DWnt4 gene expression (using Wnt4 Trojan-Gal4) and protein expression (using antibodies). In contrast, DFz2 receptor levels seemed to decrease from the anterior (A2) to the posterior end (A5/6). Together the results support the conclusion that opposing Wnt/Fz ligand-receptor gradients contribute to the diversification of Wave neurons in a location-dependent manner and that DFz2 and DFz4 have opposing effects on axon extension.

      Weaknesses:

      Wave axon and dendrite projections are not exclusively determined by Wnt4, DFz2, and DFz4, and are likely to involve other Fz receptors, Wnt ligands, and other types of receptor-ligand signaling pathways. This is in part supported by the fact that Wnt4 loss of function also resulted in phenotypes that do not mimic DFz2 KD or DFz4 KD (Figures 3D, E, and F) and that other Fz/Wnt mutants caused Wave neuron phenotypes (Figure 1-supplement 2, D+E). This is not a weakness per se, since it doesn't affect the main conclusion of the manuscript. However, the description and analyses of the data, in particular for Figure 1-supplement 2 D, should be clarified in the legend. The numbers within the bars and the asterisks are not defined. It's presumed they refer to numbers of animals assessed and that the asterisks next to DFz2 and DFz4 indicate statistically significant differences. However, only one p-value is provided in the legend. It is also unclear if p-values for the other mutants have not been determined or are non-significant. At least for mutants like Corin, which also exhibit altered axon projections, the p-values should be provided.

      We appreciate this reviewer’s careful attention to detail and intellectual curiosity. We apologize for the confusion caused by the statistical reporting in Figure 1 – figure supplement 2D. The numbers shown in the bars represent the number of neurons (i.e., Wave neurons from the left or right hemisphere). As mentioned in the Materials and Methods section, we applied a Chi-square test followed by Haberman's adjusted residual analysis to determine the statistical significance of each RNAi group. The p-value provided in the figure legend corresponds to the Chi-square test. P-values for Haberman's adjusted residual analysis were calculated for all RNAi groups, and groups without an asterisk are not statistically significant. We have clarified these points in the corresponding figure legend.
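The two-step procedure described in this response (an omnibus Chi-square test, then Haberman's adjusted residuals per cell) can be sketched in plain Python as follows. This is an illustrative sketch only, not the authors' analysis code: the group names and counts are hypothetical, and it assumes every expected count is greater than zero.

```python
import math

def chi_square_with_adjusted_residuals(table):
    """Return (chi2, degrees of freedom, adjusted residuals) for a contingency table.

    table: list of rows of observed counts (groups x outcome categories).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    residuals = []
    for i, row in enumerate(table):
        residual_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
            # Haberman's adjustment rescales the raw residual to unit variance.
            variance = expected * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            residual_row.append((observed - expected) / math.sqrt(variance))
        residuals.append(residual_row)
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof, residuals

def two_sided_p(z):
    """Two-sided p-value for a standard normal deviate."""
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical counts of neurons: columns = (normal projection, altered projection).
groups = ["control", "geneA_RNAi", "geneB_RNAi"]
table = [[18, 2], [6, 14], [17, 3]]

chi2, dof, residuals = chi_square_with_adjusted_residuals(table)
for name, row in zip(groups, residuals):
    print(name, [round(r, 2) for r in row], round(two_sided_p(row[1]), 4))
```

Each adjusted residual is approximately standard normal under independence, so a per-group p-value can be read from it. When many groups are screened, a multiple-comparison correction would normally be applied on top; the source does not state which correction, if any, was used.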

      Figure 4 D, F. The gradient for Wnt4 was determined by comparison of expression levels of other segments to A8, but the gradient for DFz2 was determined by comparison to A2, and the data support opposing gradients. However, for DFz2 (Figure 4, F) it seems that the gradient is bi-directional, with the lowest level being in A5 and increasing towards A2 as well as A8. Analysis should be performed in reference to A8 as well to determine if it is indeed bi-directional. While such a finding would not affect the interpretation of a-Wave neurons, it may impact conclusions about p-Wave neuron projections.

      We thank the reviewer for highlighting this interesting possibility. In response, we performed an additional analysis of the DFz2 gradient by comparing the signal from each neuromere to that from A8 (new Figure 5—figure supplement 3). This analysis confirmed that the gradient is indeed bidirectional. We revised the description of DFz2 expression accordingly in the revision. We believe this finding does not affect our main conclusions since only the anterior gradient is relevant for a-Wave axon guidance. 
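The re-analysis relative to a different reference neuromere can be pictured with a small Python sketch. The signal values, the helper names `normalize_to_reference` and `gradient_shape`, and the classification rule (a strict interior minimum) are all assumptions made for illustration; they are not taken from the manuscript.

```python
def normalize_to_reference(signal, reference):
    """Express each neuromere's signal relative to the chosen reference neuromere."""
    ref_value = signal[reference]
    return {segment: value / ref_value for segment, value in signal.items()}

def gradient_shape(values):
    """Classify an anterior-to-posterior series of signal values."""
    lowest = values.index(min(values))
    falling = all(a > b for a, b in zip(values[:lowest + 1], values[1:lowest + 1]))
    rising = all(a < b for a, b in zip(values[lowest:], values[lowest + 1:]))
    if 0 < lowest < len(values) - 1 and falling and rising:
        return "bidirectional"  # minimum lies in the interior of the series
    if lowest == len(values) - 1 and falling:
        return "decreasing"
    if lowest == 0 and rising:
        return "increasing"
    return "other"

# Hypothetical mean signal per neuromere, ordered anterior (A2) to posterior (A8).
signal = {"A2": 1.0, "A3": 0.8, "A4": 0.6, "A5": 0.5, "A6": 0.7, "A7": 0.9, "A8": 1.1}

relative_to_a2 = normalize_to_reference(signal, "A2")
relative_to_a8 = normalize_to_reference(signal, "A8")

print(gradient_shape(list(relative_to_a2.values())))
print(gradient_shape(list(relative_to_a8.values())))
```

Dividing by a single reference value rescales the profile without changing its shape, so the choice of reference mainly determines which pairwise comparisons are tested, which is why comparing only to A2 could leave a posterior rise unexamined.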

      As discussed above, the DFz2 KD phenotypes are consistent with the potential aberrant wiring of a-Wave neurons to forward locomotion-promoting circuits instead of to backward locomotion-promoting circuits. However, since the axons and dendrites of a-Wave and p-Wave are affected, the actual dendritic and axonal contributions to the altered behavior remain elusive. The authors certainly considered a potential contribution of altered dendrite projection of a-Wave neurons to the phenotype, and their conclusion that altered axonal projections are involved is supported by the optogenetic experiment "bypassing" sensory input (albeit it seems unlikely that all Wave neurons are activated simultaneously when perceiving natural stimuli). However, the authors should also consider that altered perception and projection of p-Wave neurons may directly (e.g. extended p-Wave axon projections increase forward locomotion input, thereby overriding backward locomotion) or indirectly (e.g. feedback loops between forward and backward circuits) contribute to the altered behavioral phenotypes in both assays. It is probably noteworthy that the more complex behavioral alterations observed with mechanical stimulation are likely to also be caused by altered dendritic projections.

      We fully agree with the reviewer’s thoughtful interpretation. We have now included these important possibilities in the revised Discussion section. Specifically, we acknowledge that while the DFz2 knockdown phenotypes are consistent with aberrant wiring of a-Wave neurons to forward locomotion-promoting circuits, the contributions of both axonal and dendritic alterations remain unclear. We also recognize that altered perception and projection of p-Wave neurons may directly or indirectly contribute to the observed behavioral phenotypes, particularly in response to mechanical stimulation.

      Presynaptic varicosities of a-Wave neurons in DFz2 KD animals are indicated by orange arrows in Figure 1. However, no presynaptic markers have been used to confirm actual ectopic synaptic connections. At least the authors should more clearly define what parameters they used to "visually" define potential presynaptic varicosities. Some arrows seem to point to more "globular structures" but for several others, it's unclear what they are pointing at.

      As mentioned in our response to Reviewer #1, we have now performed Brp immunostaining to confirm the presence of ectopic synaptic connections (new Figure 2). This analysis supports the interpretation that the presynaptic varicosities observed in DFz2 knockdown animals represent actual synaptic sites. We also clarified in the figure legend the visual criteria used to identify potential presynaptic varicosities.

      Reviewing Editor (Recommendations For The Authors):

      There are a few major concerns that we recommend the authors address:

      (1) Neuroanatomy: The point about aberrant synaptic connectivity of a-Wave neurons following DFz2 knockdown could be substantiated. This could be done by using a presynaptic marker and showing ectopic posterior presynaptic sites (and/or reduced anterior presynaptic sites) in a-Wave neurons.

      As mentioned in our response to the public review, we now have used Brp as a presynaptic marker to quantify the number and distribution of presynaptic sites along the normal and ectopic a-Wave axons (new Figure 2). We show that ectopic posterior Wave axons do contain presynaptic sites.  

      (2) Gradient calculations: As detailed in the reviews below, the Dfz2 gradient looks like it may be bidirectional. Changing the way the gradient is calculated might help address this point.

      As mentioned in our response above, we now have recalculated the gradient by comparing the DFz2 signal to A8 and show that it indeed is bidirectional (new Figure 5—figure supplement 2; formerly Figure 4—figure supplement 2).

      (3) Statistics and sample sizes: As detailed in the reviews, some of the statistical reporting could be improved. Further, increasing sample sizes could help bolster confidence in the data as well.

      As mentioned above, we have added a description of the sample size, asterisks, and p-values in the Figure 1 – figure supplement 2 legend. We also increased sample sizes of single Wave neurons in control and DFz2 knock-down animals (Figure 1F-J (WT and DFz2 KD) and new Figure 3C-G (WT; formerly Figure 2C-G)).

      (4) It would help to include some discussion of the potential contributions of altered p-wave neurons to the observed phenotypes.

      As described above, we have added in the Discussion potential contributions of altered p-wave neurons to the observed phenotypes. 

      Reviewer #1 (Recommendations For The Authors):

      (1) In the current model the authors assume that posterior elongation of a-wave neuron connectivity (axonal projections) induces a loss of connectivity to their natural targets, as backward motion is no longer induced, and a gain of connectivity to posterior wave neuron targets. Is this at the cost of innervation of p-wave neurons, e.g. did these neurons now lose connectivity to their natural targets as well? Therefore, it would be very interesting if the authors would test the behavioral responses to tactile stimuli in the posterior parts of the animal - does the response pattern change?

      This is indeed an interesting possibility that p-Wave function is altered upon DFz2 knock-down and hence behavioral response to posterior touch is changed. However, it is technically challenging to test this with tactile stimuli, due to the difficulty of (1) distinguishing between normal and fast-forward locomotion and (2) delivering a posterior touch stimulus while the larva is moving forward, which is the default behavior of the larvae on an agar plate.

      As highlighted above, the authors should provide additional evidence that the circuit response to a-wave neurons is changed after a DFz2 knockdown. The authors should monitor the activation wave in response to optogenetic activation of anterior wave neurons - analogous to the data provided in Figure 4 of their 2017 paper. If this response is now switched for a-wave activation but not p-wave activation it would greatly support their claims and this data would be less ambiguous compared to the behavioral locomotion data.

      As described in our response to the public review, we attempted this approach but found that the in vitro optogenetics experiment is unfortunately not feasible due to relatively weak expression of R60G09-GAL4 in the larvae. Local activation of control a-Wave induced fictive backward locomotion only at low frequencies, making comparison with the experimental a-Wave very difficult. The MB120B-spGAL4 used in our 2017 study could not be employed in this study as it does not drive expression during the embryonic stages and thus cannot be used to knock down DFz2 during development.

      (2) Related to this point: why would the normal "backward" circuitry of a-wave neurons be functionally suppressed in DFz2 knockdowns? Do the authors observe reduced synaptic connectivity in these segments? Vesicle clustering of synaptotagmin or other presynaptic markers could be used as a first step. As the innervation pattern is only extended by approximately one segment, it is surprising that the changes are so significant.

      We agree that these are important and interesting points, which remain to be explored in future studies. As described above, we have performed Brp immunostaining and showed that the posterior ectopic axons of a-Wave do contain synapses (new Figure 2). We also found a slight decrease in the number of synapses in the anterior region, which could partially contribute to the weaker activation of downstream neurons responsible for eliciting backward locomotion. Another possibility is that backward suppression occurs through lateral interaction among downstream circuits. Since forward and backward locomotion do not occur simultaneously, it is likely that the circuits driving these two behaviors are mutually inhibitory. Upon DFz2 knock-down in a-Wave, downstream neurons inducing fast-forward locomotion may become more strongly activated than those inducing backward locomotion, resulting in inhibition of the latter via a “winner-take-all” mechanism. Since these discussions are highly speculative, we chose not to include them in the revised manuscript.

      (3) The low number of neurons analyzed per segment is of slight concern. This is particularly the case for the control data set used in Figure 1 and Figure 2. As stated, the same datasets are used for both figures. However, at most 6 neurons were analyzed (and for two segments only 3). The control morphology may be more variable than indicated by this data.

      As mentioned above, we now have dissected 50 larvae each for the control and experimental groups, obtained seven and six clones respectively, and included these data in the revised manuscript. We apologize that the sample sizes are still relatively small but hope the reviewer understands the inherently low “hit rate” of the stochastic labelling method.

      It is somewhat curious that in Figure 1- Supplement 3 the authors report the same number of control clones per segment as in Figure 1/2 - is this simply a coincidence? And if this is an independent dataset why did the authors use new controls here but not for Figure 2? It is clear that it is very difficult to generate this data but increasing the n-number beyond 3-6 per segment would significantly increase the confidence in the presented data.

      We apologize for the confusion. The data in Figure 1 – figure supplement 3 represent the innervation pattern of dendrites, not axons. We have corrected the figure caption accordingly. These data were obtained from the same samples used to analyze axonal innervation, as shown in the original version of Figure 1F-J.

      (2) The name of the RNAi lines should be indicated in Figure 1 and Figure Supplement 3 to facilitate reading - at least the precise names should be given in both figure legends.

      We have added these labels in the revised figure legends as requested.

      (3) In Figure 4E again the control numbers of Figure 1 for the A2-wave axon are reused. This does not seem appropriate as now a different Gal4 driver is used and a different method to induce individual neuronal clones. Both components may induce significant variability in expression or arborization. As only 3 clones for the wnt4 mutant condition are analyzed (and compared to 5 control clones), this data does not allow for strong conclusions. The authors clearly state the reuse and different methods in the legend of Figure 4 F/G but should also highlight it for the E panel.

      Here, we assume that the reviewer is referring to the former Figure 3 (now Figure 4). We have added a note in the legend that the control data, obtained using a different method, were reused in this panel.

      (4) The expression levels of DWnt4 and DFz2 were analyzed at the end of embryogenesis. At what developmental stage does the axonal extension of wave neurons take place? Is the gradient maintained throughout the first larval stages?

      Based upon the lateral view of Wave neurons in Figure 1—figure supplement 1D, we think that the axonal extension is already established by approximately 20 hr after egg laying. Previously, we performed Wnt4<sup>MI03717-Trojan-GAL4</sup> > GFP.nls immunostaining in the third instar larva and observed a similar gradient of GFP signals towards the posterior end of the ventral nerve cord (VNC). We have included this data in the revised manuscript (new Figure 5—figure supplement 1).

      (5) The authors state that either 2nd or 3rd instar larvae were used for the optogenetic experiments. This may induce unnecessary variation in their assay and should be avoided. As natural variance exists in larvae regarding forward stride duration, the comparison of "on" state forward stride duration between control and experimental genotype is potentially not the best measurement of effect size. What is the difference between OFF and ON stage within the control and experimental genotype? In both cases stride duration decreases but there may not be a significant difference between the delta of the two genotypes. Thus, the observed effect may in part be due to "slower" animals in the control pool. The authors should discuss this more carefully.

      We thank the reviewer for bringing up this critical issue. Indeed, the stride durations of larvae in the control and DFz2 knock-down groups are slightly different in the OFF condition, although this difference is not statistically significant. In addition, the effect of Wave activation on mean stride duration is -0.14 s in control versus -0.21 s in DFz2 knock-down, which we interpret as DFz2 knock-down resulting in stronger fast-forward locomotion upon Wave activation. We have incorporated this note in the corresponding figure legends (new Figure 6; formerly Figure 5).
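
      The within-genotype comparison discussed here (the ON minus OFF change in mean stride duration, compared across genotypes) can be sketched as follows. This is only an illustration: the stride durations are hypothetical values chosen to reproduce the reported changes of -0.14 s and -0.21 s, and the function name is an assumption, not part of the authors' analysis.

```python
from statistics import mean


def activation_effect(off_durations, on_durations):
    """Change in mean stride duration (ON minus OFF), in seconds."""
    return mean(on_durations) - mean(off_durations)


# Hypothetical stride durations in seconds; not data from the study.
control_off = [1.00, 1.10, 0.90]
control_on = [0.86, 0.96, 0.76]
knockdown_off = [0.95, 1.05, 0.85]
knockdown_on = [0.74, 0.84, 0.64]

control_delta = activation_effect(control_off, control_on)
knockdown_delta = activation_effect(knockdown_off, knockdown_on)

# A negative difference means the knock-down shortens strides more than control.
difference = knockdown_delta - control_delta

print(round(control_delta, 2), round(knockdown_delta, 2), round(difference, 2))
```

      Comparing the two deltas rather than the ON values alone removes any baseline difference between the control and knock-down pools, which is the concern the reviewer raises.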

      (6) While the study clearly provides convincing evidence for their model, the authors should tune down their conclusions in the discussion a little bit and highlight that parts of their discussion are speculative.

      We have revised the discussion as suggested.

      Reviewer #2 (Recommendations For The Authors):

      Although the optogenetic behavioral experiments strongly support that the altered axonal projections affect normal locomotion, simultaneous labeling of Wave neurons in DFz2 KD animals with presynaptic markers would strengthen the conclusion of ectopic connection of the extended axon with other circuits.

      Please see our response to your public review.

      Figure 1 K+L, Figure 2H, I, Figure 3 F+G: many of the individual data points are not visible in the Whisker plot- changing their color would be useful to visualize them better.

      We have changed the outline width of the box plots to make the individual data points visible.

      Figure 1-Supplement 2: In addition to the comments in the public review- a) the asterisk font size changes in the different panels, e.g. it is much smaller in G', b) font size in some graphs/legends should be increased - in particular in E the hyphenated letters in the genotypes are so small rendering them almost illegible.

      We have unified the font size to make them readable in the figure. We thank the reviewer for the suggestions.

    1. eLife Assessment

      This valuable paper describes the crystal structure of a complex of the Sld3-Cdc45-binding domain (CBD) with Cdc45, which is essential for the assembly of an active Cdc45-MCM-GINS (CMG) double-hexamer at the replication origin. The structural and biochemical analyses of protein-protein interactions and DNA binding provided solid evidence to support the authors' conclusion. The results shown in the paper are of interest to researchers in DNA replication and genome stability.

    2. Reviewer #1 (Public review):

      Summary:

      The crystal structure of the Sld3CBD-Cdc45 complex presented by Li et al. is a significant contribution that enhances our understanding of CMG formation during the rate-limiting step of DNA replication initiation. This structure provides crucial insights into the intermediate steps of CMG formation, and the particle analysis and model predictions compellingly describe the mechanism of Cdc45 loading. Building upon previously known Sld3 and Cdc45 structures, this study offers new perspectives on how Cdc45 is recruited to MCM DH through the Sld3-Sld7 complex. The most notable finding is the structural rearrangement of Sld3CBD upon Cdc45 binding, particularly the α8-helix conformation, which is essential for Cdc45 interaction and may also be relevant to its metazoan counterpart, Treslin. Additionally, the conformational shift in the DHHA1 domain of Cdc45 suggests a potential mechanism for its binding to Mcm2NTD. Furthermore, the ssDNA-binding experiments involving Sld3 support a broader functional role in the replication process, beyond its established role in recruiting Cdc45. This adds an intriguing new layer to our understanding of Sld3's activity in yeast.

    3. Reviewer #2 (Public review):

      Summary

      The manuscript presents valuable findings, particularly in the crystal structure of the Sld3CBD-Cdc45 interaction and the identification of additional sequences involved in their binding. The modeling of the Sld7-Sld3CBD-CDC45 subcomplex is novel, and the results provide insights into potential conformational changes that occur upon interaction. Although the single-stranded DNA binding data from Sld3 of different species is a minor weakness, the experiments support a model in which the release of Sld3 from the complex may be promoted by its binding to origin single-stranded DNA exposed by the helicase.

    4. Reviewer #3 (Public review):

      Summary:

      The paper by Li et al. describes the crystal structure of a complex of Sld3-Cdc45-binding domain (CBD) with Cdc45 and a model of the dimer of an Sld3-binding protein, Sld7, with two Sld3-CBD-Cdc45 for the tethering. In addition, the authors showed the genetic analysis of the amino acid substitution of residues of Sld3 in the interface with Cdc45 and biochemical analysis of the protein interaction between Sld3 and Cdc45 as well as DNA binding activity of Sld3 to the single-strand DNAs of the ARS sequence.

    5. Author response:

      The following is the authors’ response to the previous reviews

      Reviewer #1 (Public review):

      Summary:

      The crystal structure of the Sld3CBD-Cdc45 complex presented by Li et al. is a significant contribution that enhances our understanding of CMG formation during the rate-limiting step of DNA replication initiation. This structure provides crucial insights into the intermediate steps of CMG formation, and the particle analysis and model predictions compellingly describe the mechanism of Cdc45 loading. Building upon previously known Sld3 and Cdc45 structures, this study offers new perspectives on how Cdc45 is recruited to MCM DH through the Sld3-Sld7 complex. The most notable finding is the structural rearrangement of Sld3CBD upon Cdc45 binding, particularly the α8-helix conformation, which is essential for Cdc45 interaction and may also be relevant to its metazoan counterpart, Treslin. Additionally, the conformational shift in the DHHA1 domain of Cdc45 suggests a potential mechanism for its binding to Mcm2NTD. Furthermore, Sld3's ssDNA-binding experiments provide evidence of its novel functions in the DNA replication process in yeast, expanding our understanding of its role beyond Cdc45 recruitment.

      Strengths:

      The manuscript is generally well-written, with a precise structural analysis and a solid methodological section that will significantly advance future studies in the field. The predictions based on structural alignments are intriguing and provide a new direction for exploring CMG formation, potentially shaping the future of DNA replication research. This research also opens up several new opportunities to utilize structural biology to unravel the molecular details of the model presented in the paper.

      Weaknesses:

      The main weakness of the manuscript lies in the lack of detailed structural validation for the proposed Sld3-Sld7-Cdc45 model, and its CMG bound models, which could be done in the future using advanced structural biology techniques such as single particle cryo-electron microscopy. It would also be interesting to explore how Sld7 interacts with the MCM helicase, and this would help to build a detailed long-flexible model of Sld3-Sld7-Cdc45 binding to MCM DH and to show where Sld7 will lie on the structure. This will help us to understand how Sld7 functions in the complex. Also, future experiments would be needed to understand the molecular details of how Sld3 and Sld7 release from CMG is associated with ssARS1 binding.

      The proposals based on this study provide new knowledge of the CMG formation process. We agree that our Sld3-Sld7-Cdc45 model should be further confirmed by cryo-EM. We improved our ssARS1-binding assay and quantified the data (see the response to Recommendations for the authors of Reviewer #3).

      Reviewer #2 (Public review):

      Summary

      The manuscript presents valuable findings, particularly in the crystal structure of the Sld3CBD-Cdc45 interaction and the identification of additional sequences involved in their binding. The modeling of the Sld7-Sld3CBD-CDC45 subcomplex is novel, and the results provide insights into potential conformational changes that occur upon interaction. Although the single-stranded DNA binding data from Sld3 of different species is a minor weakness, the experiments support a model in which the release of Sld3 from the complex may be promoted by its binding to origin single-stranded DNA exposed by the helicase.

      Strengths

      The Sld3CBD-Cdc45 structure is a novel contribution, revealing critical residues involved in the interaction.

      The model structures generated from the crystal data are well presented and provide valuable insights into the interaction sequences between Sld3 and Cdc45.

      The experiments testing the requirements for interaction sequences are thorough and conducted well, with clear figures supporting the conclusions.

      The conformational changes observed in Sld3 and Cdc45 upon binding are interesting and enhance our understanding of the interaction.

      The modeling of the Sld7-Sld3CBD-CDC45 subcomplex is a new and valuable addition to the field.

      The proposed model of Sld3 release from the complex through binding to single stranded DNA at the origin is intriguing.

      Weaknesses

      The section on the binding of Sld3 complexes to origin single-stranded DNA is somewhat weakened by the use of Sld3 proteins from different species. The comparisons between Sld3-CBD, Sld3CBD-Cdc45, and Sld7-Sld3CBD-Cdc45 involve complexes from different species, limiting the comparisons' value.

      Although the study reveals that Sld3 binds to different residues of Cdc45 than those previously shown to bind Mcm or GINS, the data in the paper do not shed any additional light on how GINS and Sld3 binding to Cdc45 or Mcms would affect each other. Other previous research has suggested that the binding of GINS and Sld3 to Mcm or Cdc45 may be mutually exclusive. The authors acknowledge that a structural investigation of Sld3, Sld7, Cdc45, and MCM during the stage of GINS recruitment will be a significant goal for future research.

      We agree that it is better to use all samples from a single source; however, due to limitations in protein expression, we used Sld7-Sld3CBD-Cdc45 from a different source. The two sources used in this study belong to the same family, and the proteins Sld7, Sld3 and Cdc45 share sequence conservation, with similar structures predicted by AlphaFold3 (RMSD = 0.356, 1.392, and 0.891 for Cα atoms of Sld7CTD, Sld7NTD-Sld3NTD, and Sld3CBD-Cdc45, respectively). Such similarity in source and proteins allows us to make the comparison. We also mentioned in our manuscript that a cryo-EM study of the Sld3-Sld7-Cdc45-MCM and Sld3-Sld7-CMG structures will be a significant goal for future research.
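
      For readers unfamiliar with the metric, the Cα RMSD quoted above is the root-mean-square distance between matched atom positions in two structures. The sketch below shows only that final formula. It assumes the two coordinate sets have already been superposed and matched atom by atom, and it uses hypothetical coordinates, so it is not the procedure used to obtain the reported values.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical, already-superposed Calpha coordinates; not taken from any model.
model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_b = [(0.0, 0.3, 0.0), (3.8, 0.0, 0.4), (7.6, 0.0, 0.0)]

print(round(rmsd(model_a, model_b), 3))
```

      In practice the superposition step (for example a least-squares fit) is performed by structure-analysis software before this value is computed.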

      Reviewer #3 (Public review):

      Summary:

      The paper by Li et al. describes the crystal structure of a complex of Sld3-Cdc45-binding domain (CBD) with Cdc45 and a model of the dimer of an Sld3-binding protein, Sld7, with two Sld3-CBD-Cdc45 for the tethering. In addition, the authors showed the genetic analysis of the amino acid substitution of residues of Sld3 in the interface with Cdc45 and biochemical analysis of the protein interaction between Sld3 and Cdc45 as well as DNA binding activity of Sld3 to the single-strand DNAs of the ARS sequence.

      Strengths:

      The authors provided a nice model of an intermediate step in the assembly of an active Cdc45-MCM-GINS (CMG) double hexamer at the replication origin, which is mediated by the Sld3-Sld7 complex. The dimer of the Sld3-Sld7 complexes tethers two MCM hexamers together for the recruitment of GINS-Pol epsilon on the replication origin.

      Weaknesses:

      The biochemical analysis should be carefully evaluated with more quantitative ways to strengthen the authors' conclusion even in the revised version.

      In this revision, we improved our ssARS1-binding assay in more quantitative ways (See the response to Recommendations for the authors).

      Reviewer #1 (Recommendations for the authors):

      I thank the authors for all their replies to my previous questions and for doing all the necessary corrections. I am satisfied with most of their replies, however, upon second reading I have a few more suggestions which could help to improve the manuscript further and make an impact in the field. My comments are listed below.

      (1) In general, the manuscript is well structured, but I feel that it requires professional English correction. In many places it was difficult to understand the sentences and I had to read it several times to understand it. Also, very long sentences should be avoided. The flow should be easy to read and understand, and that is why I feel it requires professional English correction.

      Following the comment, we checked English carefully and shortened the very long sentences.

      (2) Page 5, line 103, please include molecule after the word complex to make it like- "Only one complex molecule exists within an asymmetric unit."

      We revised this sentence (P5/L103).

      (3) Line 113- more than the N-terminal half of the protruding long helix α7 was disordered in the Sld3CBD-Cdc45 complex. This sentence is not clear. What does it mean more than the N-terminal half? Please rewrite it.

      We revised this sentence to give the corresponding residue number “(D219–H231)” (P5/L114).

      (4) Page 5, result 2- Conformation changes in Sld3CBD and Cdc45 for binding each other, this section may require a little restructuring. Line 130-131- "Therefore, the helix α8CTP seems to be an intrinsically disordered segment when Sld3 alone but folds into a helix coupled to the binding partner Cdc45 in the Sld3CBD-Cdc45 complex." This statement is the crux of the structural finding and therefore, I feel it should move after the first sentence.

      Thank you for your comments. We rewrote this part (P5/L128-131).

      (5) Line 121-122: Compared to the isolated form (PDBIDs: 5DGO for huCdc45 [31] and 6CC2 for EhCdc45 [33]) and the CMG form (PDBID: 3JC6. Write it in the same format. Make 6CC2 in bracket like other PDB IDs. Restructure this sentence.

      We revised this sentence (P5/L122-123).

      (6) Line 127-129: This sentence is also not very clear.

      We revised this sentence together with point (4) above (P5/L128-131).

      (7) In my question 4- "Can authors add a supplementary figure showing the probability of disordernes...", I meant to use a disorder prediction tool like IUPred for the protein sequences and show that α8 is predicted to be disordered upon sequence analysis. This will help to show the inherent property of the α8 helix, and it could add to the understanding that a disordered region becomes structured in the complex structure.

      The structures showed that α8CTP is stabilized by binding with Cdc45, but disordered in Sld3CBD alone, indicating that this part is flexible, like an intrinsically disordered segment. We have deposited the structure to PDB, so predictions like IUPred cannot show meaningful information.

      (8) Question 9 regarding Supplementary Figure 8- Please include your statement in the figure legend - "WT Sld3CBD was prepared in a complex with Cdc45, while the mutants of Sld3CBD existed alone, we calculated the elements of secondary structure from the crystal structure of Sld3CBD-Cdc45. The concentration of samples was controlled to the same level for CD measurement."

      Following the comment, we optimized the figure legend of Supplementary Figure 8.

      (9) Question 13- I understand that negative staining and SEC-SAXS experiments could be very tricky for such protein complexes, which have very long loops and are flexible. Did authors try a GraFix cross-linking before doing the negative staining TEM? If it is not being tried, then it might be a good idea to try it and it may help to get much cleaner particles and easier class averaging. Although I completely understand the technical challenges the authors describe and I agree with them, I still feel that one good experiment that shows this dimer model would be very helpful to strengthen the claim. I am concerned because if people start using a similar DLS experiment to calculate intermolecular distances, citing your paper, in many cases it might be a wrong interpretation. In case the negative staining still does not work, at least discuss your technical challenges in the discussion section and mention that SEC-SAXS showed a similar length of the complex and show the Guinier plot and Porod plots in the supplementary data.

      We believe that DLS is one of the methods for analyzing the single particle size. Of course, the confirmation by multiple methods will give compelling evidence. Following the comment, we added SEC-SAXS data in the [Results] (P7/L194-196) (Cdc45 recruitment to MCM DH by Sld3 with partner Sld7) and Supplementary Figure 11. The Sld7-Sld3-Cdc45 forms a flexible, long shape. Each binding domain is rigid but linked by the long loops. The flexibility problems are caused by the long loop linkers, but not by binding. So, we did not try to use the cross-linking method for analysis experiments.  

      (10) Page 8, line 221- litter sequence specificity: Correct the word "litter" with little. Also, the word shaped is written as sharped at a few places in the manuscript. Please correct it.

      We apologize for making such mistakes. We have modified these words.

      (11) Page 9, line 237-238: Would it be possible to add a lane showing Sld7 binding to the ssDNA in figure 4. I recommend showing this to understand the ssDNA binding affinity of Sld7 by itself and it will also help us to compare when it is in complex with Sld3.

      Considering that Sld7 on CMG is always a complex with Sld3, the ssDNA binding affinity should use the Sld3-Sld7 complex. Additionally, we attempted to overexpress Sld7, but could not obtain the target protein.

      Reviewer #2 (Recommendations for the authors):

      Thank you for the improved manuscript. The following sentence is unclear: "Cdc45 binds tighter to long ssDNA (>60 bases) with a litter sequence specificity".

      We apologize for making such a mistake. We modified “litter” to “little”.

      I found it challenging to understand which species were used while reading the results section and figure legends. I recommend that the authors revise the text in both the results and figure legends to clearly indicate when proteins from different species are being compared. Additionally, it would be valuable to explicitly acknowledge this limitation in the text.

      Following the comment, we added a description for using different species in results (P8/L224-225) and figure legends (Supplementary Figure 14). We added more information in the Methods to explain why we used two species for preparing proteins.

      Reviewer #3 (Recommendations for the authors):

      Major points:

      (1) The current title is not appropriate for the general readers. At least, DNA replication or DNA replication initiation should be added and abbreviations such as CBD should be avoided.

      Following the comment, we added “DNA replication” into the title. Regarding “CBD”, since the full name of “Cdc45 binding domain” is too long, we continue to use Sld3CBD.

      (2) As in my previous review, I asked for quantification of the EMSA assay shown in Figure 4 and Supplemental Figures 13 and 14. Since some signals of the bands are very weak, it is hard to conclude something. Given different protein concentrations used in the experiment, the authors should provide any kinds of value. For example, Sld3CBD-CDC45 shows weaker DNA binding than Sld3CBD alone (line 231). Is this true (or reproducible)? It is hard to conclude without any quantification.

      We have repeated the EMSA assay four or more times with different batches of overexpression, purification and DNA synthesis, indicating that the EMSA assay is reproducible. In this revision, we changed the DNA stain and adjusted the ratio between the protein and ssDNA with increasing concentrations. The smeared bands of ssDNA with Sld7–Sld3ΔC–Cdc45 or Sld7–Sld3ΔC exhibit enhanced discernibility, and the ssDNA bands are intense enough for grayscale calculations (Figure 4 in the second revised version). We used a series of t-tests to confirm a significant difference in the residual ssDNA level between Sld3CBD–Cdc45 and each of Sld3CBD, Sld7–Sld3ΔC–Cdc45, and Sld7–Sld3ΔC (t-test, ****: P<0.0001). We also carefully controlled the sample amount in the EMSA assay and described it in the [Methods].

      Moreover, in this EMSA assay (in Figure 4), the authors suggest that the disappearance of ssDNA bands corresponds with the binding of the protein to the DNA. However, it is also possible that the DNA is degraded. It is very important to show the band of protein-DNA complexes on the gel (a whole gel, not the parts of the gel shown in the Figure). Why did the authors use this "insensitive" assay using SYBR Green, not radio-labelled ssDNA?

      In this revision, we added a negative control of no ssDNA-binding by using ssARS1-3_3 for all protein samples (Sld3CBD, Sld3CBD–Cdc45, Sld7–Sld3ΔC–Cdc45 and Sld7–Sld3ΔC), which came from the same batch of expression and purification as those bound to the ssARS1s (ssARS1-2 and ssARS1-5) (Figure 4), showing that the disappearance of ssDNA bands is caused by binding to proteins, not degradation. Moreover, this time, by changing the DNA stain and increasing the concentration of the samples, the smeared ssDNA bands exhibit enhanced discernibility in the high molecular weight regions when mixed with Sld7–Sld3ΔC–Cdc45 or Sld7–Sld3ΔC, whereas no bands appeared in the NC (ssARS1-3_1). The positions of the smeared ssDNA bands correspond to those of the proteins in the protein-stained gels, indicating that the ssARS1s were complexed with proteins. Following the comment, we show all bands on the gel in Figure 4 and Supplementary Figure 14. In contrast to Sld7–Sld3ΔC–Cdc45 or Sld7–Sld3ΔC, bands of Sld3CBD bound to ssDNA could not be observed because of the pI value of Sld3CBD, which affects the entry of the samples into the gel.

      We agree that using radio-labelled ssDNA can obtain a sensitive binding assay. However, current laboratory constraints did not allow us to use radio-labelled ssDNA. Furthermore, considering the characteristics of our target proteins, Sld3CBD, Sld3CBD–Cdc45, Sld7–Sld3ΔC–Cdc45, and Sld7–Sld3ΔC, we planned to perform the binding assay in a more natural state without any modifications, labelling or linkers. Additionally, we have attempted to use ITC experiments but failed in the measurements. Presumably, the conformational flexibility of Sld7-Sld3-Cdc45 and Sld7-Sld3 caused a thermodynamic anomaly.

      Minor points:

      (1) Line 215, 80b: This should be "80 nucleotides (nt)". Throughout the text, nucleotides is better than base to show the length of ssDNAs.

      Thank you for your comments. We modified these words throughout the text.

    1. eLife Assessment

      This important study provides a description of how single-neuron firing rates in the human medial temporal lobe and frontal cortex are modulated by theta-burst stimulation of the basolateral amygdala. The results are supported by convincing evidence obtained from a rigorous task design and analysis of an incredibly rare dataset. The results may help guide future studies incorporating amygdala stimulation to improve patient health.

    2. Reviewer #1 (Public review):

      In this manuscript, Campbell et al. assess how intracranial theta-burst stimulation (TBS) applied to the basolateral amygdala in 23 epilepsy patients affects neuronal spiking in the medial temporal lobe and prefrontal cortex during a visual recognition memory task. This is an incredibly rare dataset; collecting single-unit spiking data from behaving humans during active intracranial stimulation is a Herculean task, with immense potential for translational studies of how stimulation may be applied to modulate biological mechanisms of memory. The authors utilize careful, high quality methodology throughout (e.g. task design, spike recording and sorting, statistical analysis), providing high confidence in the validity of their findings.

      In providing such a detailed and deep investigation into the single-unit responses to intracranial stimulation, the authors provide a very useful resource to researchers in the fields of brain stimulation and human neurophysiology. This work could be instrumental in guiding diverse research studies, from basic science investigating the role of theta oscillations in human cognition to translational work investigating deep-brain stimulation for memory.

      The authors have adequately addressed all prior concerns.

    3. Reviewer #2 (Public review):

      Summary:

      This study presents a valuable characterization of the effects of intracranial theta-burst stimulation of the basolateral amygdala on single-unit spiking activity in several areas of the human brain associated with memory processing. It is written clearly and concisely, allowing readers to fully understand the analysis used.

      The authors used a visual recognition memory task previously employed by their group to characterize the effects of basolateral amygdala stimulation upon memory consolidation (Inman et al, 2018). This current report presents an interesting analysis that complements the results reported in the 2018 paper.

      Strengths:

      Rare combination of human neurophysiology and behavior: The type of experiment performed in the manuscript, which contains neurophysiological data, behavior, and a deep brain stimulation (DBS) intervention, is incredibly rare and takes many years to accomplish with tight collaboration between clinical and research teams. Our understanding of spiking dynamics of human neurons is very limited, and this report is an important piece in the puzzle that allows DBS to be used in future interventions that will benefit patients' health.

      Multiple brain areas included: It's important to note that the report analyzes brain areas with which the Amygdala has extensive connections (Fig. 1A) - Hippocampus, OFC, Amygdala, ACC. It seems that neurons in all these areas were modulated by the stimulation, except the ACC, in which firing rates were so low that only a handful of neurons were included in the analysis. This is an important demonstration that low-amplitude stimulation (even when reduced to 0.5mA) can travel far and wide across the human brain.

      The experiment is cleverly designed to tease apart responses due to visual stimuli (image presentation) and electrical stimulation. The authors suggest that the units modulated by stimulation are largely distinct from those responsive to image offset during trials without stimulation. The subpopulation that responds strongly also tends to have a higher baseline firing rate. It's important to add that the chosen modulation index is more likely to be significant in neurons with higher firing rates (Figure S8). The authors discuss the tradeoff of using a nonparametric modulation index vs. other methods (for example, percent change in trial-averaged firing rate from baseline).

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      “This is an exploratory study that doesn't explore quite enough. Critically, the authors make a point of mentioning that neuronal firing properties vary across cell types, but only use baseline firing rate as a proxy metric for cell type. This leaves several important explorations on the table, not limited to the following:”

      1a: “Do waveform shape features, which can also be informative of cell type, predict the effect of stimulation?”

      To address this question, we modeled our approach to cell type classification after Peyrache et al. 2012. More specifically, we extracted two features from the mean unit waveforms—the valley-to-peak time (VP) and the peak half-width (PHW). These features were then used to classify units into two distinct clusters (k-means, clusters = 2, based on a strong prior from existing literature), representing putative excitatory and inhibitory neurons. Our approach recapitulated many of the same observations in Peyrache et al. 2012, namely (1) identification of two clusters (low PHW/VP: inhibitory, high PHW/VP: excitatory), (2) an ~80/20 ratio of excitatory/inhibitory neurons, and (3) greater baseline firing rates in the inhibitory vs. excitatory neurons. However, we did not observe a preferential modulation of one cell type compared to another (see newly created Figure 4). A description of this analysis and its takeaways has been incorporated into the manuscript.
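
      The two-cluster separation described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `kmeans_two_clusters`, the deterministic seeding, and the toy (VP, PHW) values are all assumptions made for the example.

```python
import math

def kmeans_two_clusters(points, n_iter=50):
    """Minimal 2-cluster k-means on (VP, PHW) pairs given in ms."""
    # Assumption: seed with the lexicographically smallest and largest
    # points so the result is deterministic (real analyses typically use
    # several random restarts instead).
    centroids = [min(points), max(points)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each unit joins its nearest centroid.
        labels = [
            min((0, 1), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                updated.append(centroids[k])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids

# Hypothetical waveform metrics (VP, PHW) in ms, invented for illustration.
toy_units = [(0.30, 0.28), (0.33, 0.31), (0.29, 0.30),
             (0.52, 0.50), (0.55, 0.53), (0.50, 0.49)]
labels, centroids = kmeans_two_clusters(toy_units)
```

      Following observation (1) above, the cluster whose centroid has the lower VP and PHW would be read as the putative inhibitory group.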

      Change to Text:

      Created Figure 4 (Separation of presumed excitatory and inhibitory neurons by waveform morphology).

      Caption: (A) Two metrics were calculated using the averaged waveforms for each detected unit: the valley-to-peak width (VP) and peak half-width (PHW). (B) Scatterplot of the relationship between VP and PHW; note that units with identical metrics are overlaid. Using k-means clustering, we identified two distinct response clusters, representing presumed excitatory (E, blue) and inhibitory (I, red) neurons. The units from which the example waveforms were taken are outlined in black. Probability distributions for each metric are shown along the axes. (C) Total number of units within each cluster, separated by region. (D) Comparison of baseline firing rates, separated by cluster. (E) Percent of modulated units in each cluster. * p < 0.05, NS = not significant.

      Added a description of clustering methodology to lines 132-137: “We calculated two metrics from the averaged waveform from each detected unit: the valley-to-peak-width (VP) and the peak half-width (PHW) (Figure 4A); previously, these two properties of waveform morphology have been used to discriminate pyramidal cells (excitatory) from interneurons (inhibitory) in human intracranial recordings (Peyrache et al., 2012). Next, we performed k-means clustering (n = 2 clusters) on the waveform metrics, in line with previous approaches to cell type classification.”

      Added a section in the Results titled “Theta Burst Stimulation Modulates Excitatory and Inhibitory Neurons Equally”. Lines 370-378: “Using k-means clustering, we grouped neurons into two distinct clusters based on waveform morphology, representing neurons that were presumed to be excitatory (E) and inhibitory (I) (Figure 4B). Inhibitory (fast-spiking) neurons exhibited shorter waveform VP and PHW, compared with excitatory (regular-spiking) neurons (I cluster centroid: VP = 0.50ms, PHW = 0.51ms; E cluster centroid: VP = 0.32ms, PHW = 0.31ms), and greater baseline firing rates (U(N<sub>I</sub> = 23, N<sub>E</sub> = 133) = 1074.50, p = 0.023) (Figure 4D). Although we observed a much greater proportion of excitatory vs. inhibitory neurons (E: 85.3%, I: 14.7%), stimulation appeared to affect excitatory and inhibitory neurons equally, suggesting that one cell type is not preferentially activated over another (Figure 4E).”

      Modified discussion of the effects of stimulation on different cell types. Lines 475-483: “…To test these hypotheses directly, we clustered neurons into presumed excitatory and inhibitory neurons based on waveform morphology. In doing so, we observed ~85% excitatory and ~15% inhibitory neurons, which is very similar to what has been reported previously in human intracranial recordings (Cowan et al. 2024, Peyrache et al., 2012). Interestingly, stimulation appeared to modulate approximately the same proportion of neurons for each cell type (~30%), despite the differently-sized groups. Recent reports, however, have suggested that the extent to which electrical fields entrain neuronal spiking, particularly with respect to phase-locking, may be specific to distinct classes of cells (Lee et al., 2024).”

      1b: “Is the autocorrelation of spike timing, which can be informative about temporal dynamics, altered by stimulation? This is especially interesting if theta-burst stimulation either entrains theta-rhythmic spiking or is more modulatory of endogenously theta-modulated units.”

      The reviewer is correct in suggesting that rate-modulation represents only one of many possible ways by which exogenous theta burst stimulation may influence neuronal activity. Indeed, intracranial theta burst stimulation has previously been shown to evoke theta-frequency oscillatory responses in local field potentials (Solomon et al. 2021), and other forms of stimulation (i.e., transcranial alternating current stimulation) may modulate the rhythm, rather than the rate, of neuronal spiking (Krause et al. 2019).

      To investigate whether stimulation altered rhythmicity in neuronal firing, we contrasted the spike timing autocorrelograms, as suggested. More specifically, we computed the pairwise differences in spike timing for each trial, separating spikes into the same pre-, during-, and post-stimulation epochs described in the manuscript (bin size = 5 ms, max lag = 250 ms), grouped neurons by whether they were modulated, and then contrasted the differences in the latencies of the peak normalized autocorrelation value between epochs. Only neurons with a firing rate of ≥ 1 Hz (n = 70/203, 34.5%) were included in this analysis since sparse firing resulted in noisy autocorrelation estimates. Subsequent statistical testing of the peak latency differences between pre-/during- and pre-/post-stimulation did not reveal any group-level differences (Mann-Whitney U tests, p > 0.05). Thus, we were not able to identify neuronal responses suggestive of altered rhythmicity (see Figure S5). A description of this analysis and its takeaways has been incorporated into the manuscript.

      Of note, there are two elements of the data that constrain our ability to detect modulation in the rhythm of firing. First, the baseline activity recorded across neurons modulated by stimulation was relatively low (i.e., median firing rate = 1.77 Hz). Second, stimulation often resulted in a suppression, rather than an enhancement, of firing rate. Taken together, the sparse firing afforded limited opportunity to characterize changes to subtle patterns of spiking. 
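
      The autocorrelogram contrast described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the function names and the toy spike train are hypothetical, spike times are taken to be in milliseconds, and the Gaussian smoothing and pooling across trials mentioned in the Figure S5 caption are omitted.

```python
def autocorrelogram(spike_times_ms, bin_size_ms=5.0, max_lag_ms=250.0):
    """Histogram of positive pairwise spike-time differences for one epoch."""
    n_bins = int(max_lag_ms // bin_size_ms)
    counts = [0] * n_bins
    times = sorted(spike_times_ms)
    for i, t_i in enumerate(times):
        for t_j in times[i + 1:]:
            lag = t_j - t_i
            if lag >= max_lag_ms:
                break  # times are sorted, so later lags only grow
            counts[int(lag // bin_size_ms)] += 1
    return counts

def peak_lag_ms(counts, bin_size_ms=5.0):
    """Centre of the bin holding the ACG peak, or None if there are no pairs.

    Peak normalisation rescales the counts but does not move the peak,
    so the raw counts are searched directly.
    """
    peak = max(counts)
    if peak == 0:
        return None
    return (counts.index(peak) + 0.5) * bin_size_ms

# Hypothetical spike train with a regular 127 ms interval (illustration only).
toy_spikes = [0.0, 127.0, 254.0, 381.0]
toy_peak = peak_lag_ms(autocorrelogram(toy_spikes))
```

      Computing `peak_lag_ms` separately for the pre-, during-, and post-stimulation epochs and differencing the results gives the per-neuron latency shifts that the response compares between modulated and unmodulated units. An epoch with fewer than two spikes returns `None`, which mirrors why sparsely firing neurons had to be excluded.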

      Change to Text:

      Created Figure S5 (Analysis of modulation in spiking rhythmicity)

      Caption: (A) Representative autocorrelograms (ACG) for a single neuron. The pairwise differences in spike timing were computed for each trial and epoch (bin size = 5 ms, max lag = 250 ms), then smoothed with a Gaussian kernel. The peak in the normalized ACG across trials was computed for each epoch. (B) Kernel density estimate of the peak ACG lag, separated by epoch. (C) The peak ACG lags were split by whether the neuron was modulated (Mod) or unaffected by stimulation (NS = not significant) for each of the two contrasts: pre- vs. during-stim (left) and pre- vs. post-stim (right).

      Details about the autocorrelation methodology have been incorporated. Lines 166-172: “To investigate whether stimulation altered rhythmicity in neuronal firing, we analyzed the spike timing autocorrelograms. More specifically, we computed the pairwise differences in spike timing for each trial (bin size = 5 ms, max lag = 250 ms) and then contrasted the differences in the latencies of the peak normalized autocorrelation value between epochs (pre-, during-, post-stimulation). Only neurons with a firing rate of ≥ 1 Hz (n = 70/203, 34.5%) were included in this analysis since sparse firing resulted in noisy autocorrelation estimates.”

      The results from contrasting the autocorrelograms are now mentioned briefly. Lines 297-298: “Stimulation, however, did not appear to alter the rhythmicity in neuronal firing, as measured by spiking autocorrelograms (Figure S5).”

      1c: “The authors reference the relevance of spike-field synchrony (30-55 Hz) in animal work, but ignore it here. Does spike-field synchrony (comparing the image presentation to post-stimulation) change in this frequency range? This does not seem beyond the scope of investigation here.”

      We agree that a further characterization of spike-field and spike-phase relationships may provide rich insights into more complex regional and interregional dynamics that may be altered by stimulation. Given that many metrics are biased by sample size (e.g., number of spikes), which can vary considerably, the pairwise phase consistency (PPC) between spikes and LFP is a preferred metric (Vinck et al. 2010). Although PPC is unbiased, its variance nonetheless increases considerably with low spike counts; pooling spike counts across trials, however, decouples the temporal relationship between spiking and the LFP phase for each trial, confounding results and yielding an unstable estimate.

      To determine whether such an analysis is indeed possible, we calculated the percentage of stimulation trials with ≥ 10 spikes in both the 1s pre- and post-stimulation epochs (a relatively low threshold for inclusion). Only a very small proportion of the total number of trials across all neurons met this criterion (2.5%). Thus, because of the sparse spiking in our data, we are unable to reliably characterize spike-field or spike-phase modulation in detected neurons.
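
      The two quantities discussed above, the PPC and the trial-inclusion criterion, can be sketched as follows. The PPC is written as the mean cosine of all pairwise phase differences; the function names and the toy inputs are assumptions for illustration and are not taken from the manuscript.

```python
import math
from itertools import combinations

def pairwise_phase_consistency(phases):
    """Mean cosine of the phase difference over all pairs of spikes.

    `phases` holds the LFP phase (radians) at each spike time.
    """
    n = len(phases)
    if n < 2:
        raise ValueError("PPC needs at least two spikes")
    total = sum(math.cos(a - b) for a, b in combinations(phases, 2))
    return 2.0 * total / (n * (n - 1))

def fraction_of_usable_trials(pre_counts, post_counts, min_spikes=10):
    """Share of trials with at least `min_spikes` in BOTH epochs."""
    usable = sum(
        1 for pre, post in zip(pre_counts, post_counts)
        if pre >= min_spikes and post >= min_spikes
    )
    return usable / len(pre_counts)

# Hypothetical per-trial spike counts (illustration only).
toy_pre = [12, 3, 15, 10]
toy_post = [11, 20, 2, 10]
toy_fraction = fraction_of_usable_trials(toy_pre, toy_post)
```

      With only a handful of spikes per trial, the number of pairs entering the PPC average is small, which is one way to see why the estimate becomes unstable even though it is unbiased.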

      Change to Text:

      In the manuscript, we have added a description of why our data is not well-suited to investigate these relationships.

      Lines 532-538: “The present study did not investigate interactions between spiking activity and local field potentials because neuronal spiking was sparse at baseline and often further suppressed by stimulation; only a very small proportion of the total number of trials across all neurons exhibited ≥ 10 spikes in both the 1s pre- and post-stimulation epochs (~2.5%). Although certain metrics are not biased by sample size (e.g., pairwise phase consistency), low spike counts can dramatically affect variance and, therefore, result in unstable estimates (Vinck et al., 2011).”

      1d: “How does multi-unit activity respond to stimulation? At this somewhat low count of neurons (total n=156 included) it would be valuable to provide input on multi-unit responses to stimulation as well.”

      We thank the reviewer for this suggestion. We have incorporated an analysis of multiunit activity (MUA), which similarly identifies robust modulation via permutation-based statistical testing and characterizes the different profiles of responses (i.e., increased vs. decreased MUA threshold crossings pre- vs. post-stimulation).

      Change to Text:

      Created Figure S8 (Analysis of multiunit activity response to stimulation)

      Caption: (A) Example trace of multiunit activity (MUA) in one channel during a single stimulation trial. Threshold crossings are highlighted with a pink dot overlaid on the MUA signal with a corresponding hash below. (B) The percentage of channels with significantly modulated MUA, separated by the direction of effect. (C) The percentage of channels with significantly modulated MUA, separated by direction effect and region. Inc (red; post > pre) vs. Dec (blue; post < pre). HIP = hippocampus, OFC = orbitofrontal cortex, AMY = amygdala, ACC = anterior cingulate cortex. *** p < 0.001, NS = not significant.

      Details about the MUA methodology have been incorporated. Lines 174-180: “Finally, we measured modulation in multiunit activity (MUA) by filtering the microelectrode signals in a 300-3,000 Hz window and counting the number of threshold crossings. Thresholds were determined on a per-channel basis and defined as -3.5 times the root mean square of the signal during the baseline period; activity during stimulation was excluded since stimulation artifact is difficult to separate from MUA in the absence of spike sorting.”

      MUA results are now incorporated. Lines 365-367: “Additional characterization of MUA revealed a dominant signature of increased activity post- vs. pre-stimulation, in line with these trends observed at the single-neuron level (Figure S8).”
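
      The thresholding step in the MUA methodology can be sketched as below. This is a minimal illustration, assuming the signal has already been band-pass filtered (300-3,000 Hz) and that a crossing is counted once, when the trace first drops below the threshold; the function names and sample values are hypothetical.

```python
import math

def mua_threshold(baseline, k=3.5):
    """Negative threshold at -k times the RMS of the baseline samples."""
    rms = math.sqrt(sum(x * x for x in baseline) / len(baseline))
    return -k * rms

def count_threshold_crossings(signal, threshold):
    """Count downward crossings: a sample below threshold whose
    predecessor was at or above it."""
    return sum(
        1 for prev, curr in zip(signal, signal[1:])
        if prev >= threshold > curr
    )

# Hypothetical filtered samples for one channel (illustration only).
toy_baseline = [1.0, -1.0, 1.0, -1.0]
toy_signal = [0.0, -4.0, 0.0, -1.0, -5.0, -6.0, 0.0]
toy_threshold = mua_threshold(toy_baseline)
toy_crossings = count_threshold_crossings(toy_signal, toy_threshold)
```

      Counting crossings separately in the pre- and post-stimulation windows of each trial yields the paired counts that a permutation test can then compare.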

      1e: “Several intracranial studies have implicated proximity to white matter in determining the effects of stimulation on LFPs; do the authors see an effect of white matter proximity here?”

      We thank the reviewer for the interesting question. Subsequent characterization revealed only small differences in the proximity of stimulation contacts to white matter (range 1.5-8.0 mm), likely because the chosen target (i.e., basolateral amygdala) has several nearby white matter structures (e.g., stria terminalis). Nonetheless, we performed a linear regression between the proximity to white matter and the stimulation-induced effect on behavior (stimulation vs. no-stimulation d’ difference), the results of which indicate no clear association (p > 0.05; see Figure S9). Critically, this is not to suggest that white matter proximity has no interaction with the reported behavioral effects, but rather, that we could not identify such an association within our data.

      Change to Text:

      Created Figure S9 (The effect of stimulation proximity to white matter and distance to recorded neurons).

      Caption: (A) Kernel density estimate of the Euclidean distance from stimulation contacts to nearest WM structure (in mm); hash marks represent individual observations. (B) The change in memory performance (Δd’) was linearly regressed onto the distance from the stimulated contacts to white matter.

      The following has been added to lines 405-426: “Proximity to white matter has been shown to influence the effects of stimulation on behavior and the strength of evoked responses (Mankin et al., 2021; Mohan et al., 2020; Paulk et al., 2022). Across all stimulated contacts, we observed only small differences in the proximity of stimulation contacts to white matter (median = 4.5 mm, range = 1.5-8.0 mm), likely because the chosen target (i.e., basolateral amygdala) has several nearby white matter structures (e.g., stria terminalis). Nonetheless, we performed a linear regression between the proximity to white matter and the stimulation-induced effect on behavior (stimulation vs. no-stimulation d’ difference), the results of which indicate no clear association (p > 0.05; see Figure S9).”

      Comment 2: “It is a little confusing to interpret stimulation-induced modulation of neuronal spiking in the absence of stimulation-induced change in behavior. How do the authors' findings tell us anything about the neural mechanisms of stimulation-modulated memory if memory isn't altered? In line with point #1, I would suggest a deeper dive into behavior (e.g. reaction time? Or focus on individual sessions that do change in Figure 4A?) to make a stronger statement connecting the neural results to behavioral relevance.”

      We agree that the connection between the observed stimulation-induced neuronal modulation and effects on behavior is unclear and has proven challenging to elucidate. Per the reviewer’s suggestion, we further focused our analyses on the neuronal modulation effects in the individual sessions that resulted in a robust change in memory performance (stimulation vs. no-stimulation d’ difference threshold of ± 0.5, based on a moderate effect size for Cohen’s d); both a positive and negative threshold were used to capture robust changes in memory performance associated with firing rate modulation, whether enhancement or suppression. To this end, we contrasted the proportion of modulated neurons in the sessions where stimulation resulted in a robust behavioral change (Δd’) with those that did not (~d’). We did not observe a difference in the proportions between groups when collapsed across all sampled regions, or when separately evaluated (Fisher’s exact tests, p > 0.05; see Figure 5C).

      Given that this approach did not further clarify the connection between our neural and behavioral results, we believe it is most appropriate to deemphasize claims in the manuscript regarding the potential insights for behavioral modulation (e.g., memory enhancement), and have done so.
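
      The session classification and proportion contrast described in this response can be sketched as follows. The ±0.5 threshold and the use of Fisher's exact test come from the response; the function names, the hand-written two-sided test, and the example table are assumptions for illustration only.

```python
from math import comb

def classify_session(delta_d_prime, threshold=0.5):
    """Label a session by its stim vs. no-stim d' difference."""
    if delta_d_prime > threshold:
        return "enhanced"
    if delta_d_prime < -threshold:
        return "impaired"
    return "negligible"

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    return sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )

# Hypothetical counts: rows = responder / non-responder sessions,
# columns = modulated / unmodulated units (illustration only).
toy_p = fisher_exact_two_sided(3, 0, 0, 3)
```

      Sessions labeled "enhanced" or "impaired" would be pooled as responders before building the table, matching the response's choice to use both a positive and a negative threshold.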

      Change to Text:

      Toned down reference to the memory-related effects of stimulation in the abstract by removing the following lines from the abstract: “Previously, we demonstrated that intracranial theta burst stimulation (TBS) of the basolateral amygdala (BLA) can enhance declarative memory, likely by modulating hippocampal-dependent memory consolidation…” and “…and motivate future neuromodulatory therapies that aim to recapitulate specific patterns of activity implicated in cognition and memory.”

      Changed Figure 4 to Figure 5

      Created Figure 5C (Interaction between behavioral effects and neuronal modulation).

      Caption: (C) Change in recognition memory performance was split into two categories using a d’ difference threshold of ± 0.5: responder (positive or negative; Δd’, pink) and non-responder (~d’, grey). Individual d’ scores are shown (left) with points colored by outcome category; dotted lines demarcate category boundaries, and the grey-shaded region represents negligible change. The number of sessions within each outcome category (middle) and the proportion of modulated units as a function of outcome category, separated by region (right). NS = not significant.

      The description of the behavioral results has been updated. Lines 394-403: “At the level of individual sessions, we observed enhanced memory (Δd’ > +0.5) in 36.7%, impaired memory (Δd’ < -0.5) in 20.0%, and negligible change (-0.5 ≤ Δd’ ≤ 0.5) in 43.3% when comparing performance between the stim and no-stim conditions; a threshold of Δd’ ± 0.5 was chosen for this classification based on the defined range of a “medium effect” for Cohen’s d. To test our hypothesis that neuronal modulation would be associated with changes in memory performance, we combined the sessions that resulted in either memory enhancement or impairment and contrasted the proportion of modulated units across regions sampled. We did not, however, observe a meaningful difference in the proportion of modulated units when grouped by behavioral outcome (all contrasts p > 0.05) (Figure 5C).”

      Lines 213-214 and 394-397 have been edited to reflect a change in the d’ threshold used for categorizing behavioral results (from Δd’ ± 0.2 to Δd’ ± 0.5).

      Comment 3: “It is not clear to me why the assessment of firing rates after image onset and after stim offset is limited to one second - this choice should be more theoretically justified, particularly for regions that spike as sparsely as these.”

      We thank the reviewer for this question and acknowledge that no clear justification was provided for this decision in the manuscript. We limited each of the analysis epochs to 1 s for two reasons. First, the maximum possible length of the during-stimulation epoch was 1 s (stim on for 1 s). Although the pre- and post-stimulation epochs could be extended without issue, we were concerned that variable time windows could introduce a bias, for instance, resulting in different variances between epochs. Second, we anticipated, both from empirical observations and prior literature, that the neural response following stimulation or task features (e.g., image onset/offset) was likely to be transient, rather than sustained for a period of many seconds. By keeping the windows short, we ensured that our approach to detecting modulation (i.e., contrasting trial-wise spike counts between each pair of epochs) captured the intended effect rather than random noise. We have incorporated a discussion of this rationale in the Peri-Stimulation Modulation Analyses section.

      Change to Text:

      Lines 156-158 have been added: “Each epoch was constrained to 1 s to ensure that subsequent firing rate contrasts were unbiased and to capture potential transient effects (e.g., image onset/offset).”

      Comment 4: “This work coincides with another example of human intracranial stimulation investigating the effect on firing rates (doi: https://doi.org/10.1101/2024.11.28.625915). Given how incredibly rare this type of work is, I think the authors should discuss how their work converges with this work (or doesn't).”

      Thank you for bringing this highly relevant work to our attention. We were unaware of this recent preprint and have incorporated a discussion of its main findings into the manuscript.

      Change to Text:

      New citations: van der Plas et al. 2024 (bioRxiv), Cowan et al. 2024 (bioRxiv)

      The discussion of related studies has been updated. Lines 447-457: “Few studies, however, have characterized the impact of electrical stimulation via macroelectrodes on the spiking activity of human cortical neurons, none of which involve intracranial theta burst stimulation. One study reported a long-lasting reduction in neural excitability among parietal neurons, with variable onset time and recovery following continuous transcranial TBS in non-human primates (Romero et al., 2022). In a similar vein, it was recently shown that human neurons are largely suppressed by single-pulse electrical stimulation (Cowan et al., 2024; van der Plas et al., 2024). Other emerging evidence suggests that transcranial alternating current stimulation may entrain the rhythm rather than rate of neuronal spiking (Krause et al., 2019) and that stimulation-evoked modulation of spiking may meaningfully impact behavioral performance on cognitive tasks (Fehring et al., 2024).”

      Comment 5: “What information does the pseudo-population analysis add? It's not totally clear to me.”

      We recognize the need to further contextualize the motivation for the exploratory pseudo-population analysis and thank the reviewer for bringing the lack of detail to our attention. In brief, the analysis allowed us to observe trends in activity across populations of neurons, which, in principle, are not visible by characterizing modulation solely in discrete neurons. Additional details have been incorporated into the manuscript, as suggested.

      Change to Text:

      Additional justification has been incorporated in the description of the methodology. Lines 185-187: “…This approach enables the identification of dominant patterns of coordinated neural activity that may not be apparent when examining individual neurons in isolation.”, lines 192-194: “…By collapsing across subjects into a common pseudo-population, this analysis provides a mesoscale view of how stimulation modulates shared activity patterns across anatomically distributed neural populations.”

      A summary interpretation has been added to the paragraph describing the results. Lines 326-328: “Taken together, these analyses reveal global structure in the state space of responses to BLA stimulation within hippocampal circuits.”

      Reviewer #2 (Public review):

      Comment 1: “Authors suggest that the units modulated by stimulation are largely distinct from those responsive to image offset during trials without stimulation. The subpopulation that responds strongly also tends to have a higher baseline firing rate. It's important to add that the chosen modulation index is more likely to be significant in neurons with higher firing rates.”

      This is an important point that was not previously addressed in our manuscript. We suspect there are likely two factors at play worth considering with respect to our chosen nonparametric modulation index: neurons with lower activity require smaller changes in spike counts to be significantly modulated (easier to flip ranks), and neurons with higher activity empirically exhibit greater absolute shifts in the number of spikes. Our further use of permutation testing, while mitigating false positives, may also somewhat constrain the ability to detect modulation in sparsely active neurons. Nonetheless, given that many trials entailed few or no spikes, we believe this approach is preferable to alternatives that may be more susceptible to noise (e.g., percent change in trial-averaged firing rate from baseline).

      To better understand the tradeoffs with detection probability, we performed a sensitivity analysis. We generated synthetic data with different baseline firing rates (0.1-5.0 Hz) and effect sizes (± 0.1-0.7 Hz) and simulated the likelihood of detection with our given modulation index across neurons. The results of the simulation support the notion that the probability of detecting modulation is lower for sparsely active neurons (Figure S7C). Further discussion of this consideration for the chosen modulation index, as well as details regarding the sensitivity analysis, have been incorporated into the manuscript.
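
      A detection-probability simulation of this kind can be sketched as follows. This is a minimal illustration under loud assumptions: spike counts are drawn as Poisson over 1 s epochs, and a paired sign-flip permutation test stands in for the authors' nonparametric modulation index, which is not specified in this response. All function names, trial counts, and permutation counts are hypothetical.

```python
import math
import random

def poisson_sample(rate, rng):
    """Draw one Poisson count (Knuth's method; adequate for small rates)."""
    limit = math.exp(-rate)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def sign_flip_p_value(diffs, rng, n_perm=200):
    """Two-sided paired permutation test on the summed difference.

    Stand-in for the manuscript's modulation index (an assumption).
    """
    observed = abs(sum(diffs))
    exceed = 0
    for _ in range(n_perm):
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)

def detection_probability(baseline_hz, effect_hz, n_trials=30,
                          n_neurons=40, alpha=0.05, seed=0):
    """Fraction of synthetic neurons flagged as modulated."""
    rng = random.Random(seed)
    post_hz = max(0.0, baseline_hz + effect_hz)  # rates cannot go below zero
    detected = 0
    for _ in range(n_neurons):
        diffs = [
            poisson_sample(post_hz, rng) - poisson_sample(baseline_hz, rng)
            for _ in range(n_trials)
        ]
        if sign_flip_p_value(diffs, rng) < alpha:
            detected += 1
    return detected / n_neurons
```

      Sweeping `baseline_hz` over 0.1-5.0 Hz and `effect_hz` over ± 0.1-0.7 Hz would produce a grid comparable in layout to the one the response describes. Note that the clamp at zero limits how large a decrease can be expressed for very sparse neurons, which is one ingredient of the floor effect under discussion.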

      Change to Text:

      Created Figure S7C (Detection probability analysis)

      Caption: The same permutation-based analyses reported in the manuscript were repeated under different control conditions… (C) Visualization of the predicted probability of detecting modulation across synthetic neurons with variable firing rates and modulation effect sizes; FR = firing rate.

      Lines 223-224 have been added to the Methods section titled “Firing Rate Control Analyses”: “We performed a series of control analyses to test whether our approach to firing rate detection was robust…”

      A description of the simulation has been incorporated into the same section as above. Lines 234-237: “Finally, to better understand the tradeoffs with our statistical approach, we generated synthetic data with different baseline firing rates (0.1-5.0 Hz) and effect sizes (± 0.1-0.7 Hz), then simulated the likelihood of detecting modulation across variable conditions (Figure S7C).”

      The description of the results from the control analyses has been updated. Lines 330-339: “Finally, we performed three supplementary analyses to evaluate the robustness of our approach to detecting firing rate modulation: a sensitivity analysis assessing the proportion of modulated units at different firing rate thresholds for inclusion/exclusion, a data dropout analysis designed to control for the possibility that non-physiological stimulation artifacts may preclude the detection of temporally adjacent spiking, and a synthetic detection probability analysis. These results recapitulate our observation that units with higher baseline firing are most likely to exhibit modulation (though the probability of detecting modulation is lower for sparsely active neurons) and suggest that suppression in firing rate is not solely attributable to amplifier saturation following stimulation (Figure S7).

      Comment 2: “Readers can benefit from understanding with more details the locations chosen for stimulation - in light of previous studies that found differences between effects based on proximity to white matter (For example - PMID 32446925, Mohan et al, Brain Stimul. 2020 and PMID 33279717 Mankin et al Brain Stimul. 2021).”

      This has been addressed in the above response to Reviewer 1’s comment 1e.

      Change to Text:

      See changes related to Reviewer 1’s comment 1e.

      Comment 3: “Missing information in the manuscript…”

      3a: “Images of stimulation anatomical locations for all subjects included in this study. Ideally information about the impedance of the contacts to be able to calculate the actual current used.”

      As requested, we have provided an image from the coronal T1 MRI sequence, which highlights the position of the stimulated contacts for each of the 16 patients. Though we did not measure the impedances directly, the stimulation was current-controlled, which ensured that the desired current and charge density were consistent regardless of the tissue or electrode impedance.

      Change to Text:

      Created Figure S1 (Anatomical location of stimulated electrodes).

      Caption: A coronal slice from the T1-weighted MRI scan is shown for each patient who participated in the study (n = 16). Electrode contacts within the same plane of the image are shown with blue circles, and the bipolar pair of stimulated contacts within the basolateral amygdala is highlighted in red.

      Lines 144-145 have been edited to reflect that the delivered stimulation was current-controlled: “Specifically, we administered current-controlled, charge-balanced, …”

      3b: “The studied population is epilepsy patients, and the manuscript lacks description of their condition, proximity to electrodes included in the study to pathological areas, and the number of units from each patient/hemisphere.”

      We agree that additional information regarding patient demographics, experimental details, and clinical characteristics would further contextualize this unique patient population. A new table has been included, which contains the following information: patient ID, sex, age, # experimental sessions, # SEEG leads (and # microelectrodes), # detected units (L vs. R hemisphere), and suspected seizure onset zone.

      Change to Text:

      Created Table S1 (Patient demographics and clinical characteristics).

      Lines 258-259 have been added: “…(see Table S1 for patient demographics).”

      3c: “I haven't seen any comments on code availability (calculating modulation indices and statistics) and data sharing.”

      For clarification, a section titled Resource Availability is already appended to the end of the manuscript following the Conclusion, which describes the data and code availability.

      Change to Text:

      None

      3d: “Small comment - Figure legend 3E - Define gray markers (non-modulated units?)”

      Thank you for highlighting this omission. We have updated the relevant figure caption.

      Change to Text:

      The following has been added to the Figure 3 caption: “…whereas units without a significant change in activity are shown in grey.”

    1. eLife Assessment

      This study presents an important discovery regarding the diversity and evolution of gall-forming microbial effectors. Supported by convincing computational structural predictions and analyses, the research provides insights into the unique mechanisms by which gall-forming microbes exert their pathogenicity in plants. This study also offers guidance that is of value for future studies on pathogen effector function and co-evolution with host plants.

    2. Reviewer #1 (Public review):

      Summary:

      This manuscript presents a comprehensive structure-guided secretome analysis of gall-forming microbes, providing valuable insights into effector diversity and evolution. The authors have employed AlphaFold2 to predict the 3D structures of the secretome from selected pathogens and conducted a thorough comparative analysis to elucidate commonalities and unique features of effectors among these phytopathogens.

      Strengths:

      The discovery of conserved motifs such as 'CCG' and 'RAYH' and their central role in maintaining the overall fold is an insightful finding. Additionally, the discovery of a nucleoside hydrolase-like fold conserved among various gall-forming microbes is interesting.

      Weaknesses:

      Important conclusions are not verified by experiments.

      Comments on revisions: I acknowledge the authors' revision efforts.

    3. Reviewer #2 (Public review):

      Summary:

      Soham Mukhopadhyay et al. investigated the protein folding of the secretome from gall-forming microbes using the AI-based structure-modeling tool AlphaFold2. Their study analyzed six gall-forming species, including two Plasmodiophorid species and four others spanning different kingdoms, along with one non-gall-forming Plasmodiophorid species, Polymyxa betae. The authors found no effector fold specifically conserved among gall-forming pathogens, leading to the conclusion that their virulence strategies are likely achieved through diverse mechanisms. However, they identified an expansion of the Ankyrin repeat family in two gall-forming Plasmodiophorid species, with a less pronounced presence in the non-gall-forming Polymyxa betae. Additionally, the study revealed that known effectors such as CCG and AvrSen1 belong to sequence-unrelated but structurally similar (SUSS) effector clusters.

      Strengths:

      (1) The bioinformatics analyses presented in this study are robust, and the AlphaFold2-derived resources deposited in Zenodo provide valuable resources for researchers studying plant-microbe interactions. The manuscript is also logically organized and easy to follow.

      (2) The inclusion of the non-gall-forming Polymyxa betae strengthens the conclusion that no effector fold is specifically conserved in gall-forming pathogens and highlights the specific expansion of the Ankyrin repeat family in gall-forming Plasmodiophorids.

      (3) Figure 4a and 4b effectively illustrate the SUSS effector clusters, providing a clear visual representation of this finding.

      (4) Figure 1 is a well-designed, comprehensive summary of the number and functional annotations of putative secretomes in gall-forming pathogens. Notably, it reveals that more than half of the analyzed effectors lack known protein domains in some pathogens, yet some were annotated based on their predicted structures, despite the absence of domain annotations.

      Weaknesses:

      (1) The effector families discussed in this paper remain hypothetical in terms of their functional roles, which is understandable given the challenges of demonstrating their functions experimentally. However, this highlights the need for experimental validation as a next step.

      Authors' response: Thank you. Yes, there is a lot of work to do in the coming years.

      Reviewer's response: Incorporating experimental validation substantially strengthened the manuscript. Did you try the AlphaFold-Multimer prediction of the interaction between PBTT_00818 and the GroES-like protein? Does the model indicate a high-confidence interface?

      (2) Some analyses, such as those in Figure 4e, emphasize motifs derived from sequence alignments of SUSS effector clusters. Since these effectors are sequence-unrelated, sequence alignments might be unreliable. It would be more rigorous to perform structure-based alignments in addition to sequence-based ones for motif confirmation. For instance, methods described in Figure 3E of de Guillen et al. (2015, https://doi.org/10.1371/journal.ppat.1005228) or tools like Foldseek could be useful for aligning structures of multiple sequences.

      Authors' response: In Fig. 4e, we highlight the conserved cysteine residues. While there is no clearly conserved overall motif, the figure illustrates that despite the high sequence divergence, the key cysteines involved in disulfide-bridge formation are consistently conserved across the sequences.

      Reviewer's response: Understood. Nevertheless, if a reliable sequence alignment can indeed be generated, I would interpret this to mean that the CCG effectors constitute a highly diversified family rather than being truly sequence unrelated. By comparison, members of the MAX effector family share a common fold, yet their sequences are so divergent that sequence alignment is impossible.

      (3) When presenting AlphaFold-generated structures, it is essential to include confidence scores such as pLDDT and PAE. For example, in Figure 1D of Derbyshire and Raffaele (2023, https://doi.org/10.1038/s41467-023-40949-9), the structural representations were colored red due to their high pLDDT scores, emphasizing their reliability.

      Authors' response: Thank you for the observation. Due to the restrictive parameters used in our analysis, over 90 % of the structure would appear red. For this reason, we chose not to include the color scale, as it would not provide additional informative value in this context.

      Reviewer's response: Understood.
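
      For readers who want to check a claim such as "over 90 % of the structure would appear red" on the deposited models, a minimal sketch follows. It assumes the models are in PDB format, where AlphaFold writes the per-residue pLDDT into the B-factor column; the function names `plddt_per_residue` and `fraction_above` are illustrative and are not taken from the authors' pipeline.

```python
def plddt_per_residue(pdb_text):
    """Return one pLDDT value per residue, read from the C-alpha B-factor field."""
    scores = []
    for line in pdb_text.splitlines():
        # Fixed-width PDB records: atom name in columns 13-16, B-factor in 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    return scores


def fraction_above(scores, threshold=90.0):
    """Fraction of residues whose pLDDT is strictly above the threshold."""
    if not scores:
        return 0.0
    return sum(s > threshold for s in scores) / len(scores)
```

      Reporting this fraction, or the mean pLDDT, per model is one compact alternative to a colour scale when most residues fall in the same confidence band.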

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      This manuscript presents a comprehensive structure-guided secretome analysis of gall-forming microbes, providing valuable insights into effector diversity and evolution. The authors have employed AlphaFold2 to predict the 3D structures of the secretome from selected pathogens and conducted a thorough comparative analysis to elucidate commonalities and unique features of effectors among these phytopathogens.

      Strengths:

      The discovery of conserved motifs such as 'CCG' and 'RAYH' and their central role in maintaining the overall fold is an insightful finding. Additionally, the discovery of a nucleoside hydrolase-like fold conserved among various gall-forming microbes is interesting.

      Weaknesses:

      Important conclusions are not verified by experiments.

      Thank you very much. There are many aspects of this study that could be further validated, each potentially requiring years of work. Therefore, we chose to focus on two specific questions: Are AlphaFold-Multimer predictions accurate, and can ANK effectors target more than one host protein? In particular, we focused on the identification of putative targets for one of the ankyrin repeat proteins, PBTT_00818 (Fig. 6). Using one-by-one yeast two-hybrid (Y2H) assays, we tested the AlphaFold-Multimer prediction of an interaction between PBTT_00818 and MPK3. The interaction did not occur in yeast, suggesting it might not take place under those conditions.

      This negative result led us to perform a Y2H screen using an Arabidopsis cDNA library, which identified a GroES-like protein, highly expressed in roots, as a potential target of the ANK effector. Surprisingly, both the PBTT_00818–MPK3 and PBTT_00818–GroES-like protein interactions were later confirmed in planta using BiFC assays. These findings suggest two key points: (1) AlphaFold predictions can be accurate for ANK proteins, and (2) ANK domains, known for mediating protein-protein interactions, may enable these effectors to target multiple host proteins.

      Although the precise biological implications remain unclear, it is possible that ANK proteins act as scaffolds or adaptors for other effectors during infection. The validations presented here open exciting avenues for further research into the role of ANK proteins in Plasmodiophorid pathogenesis and gall formation. This is presented in the corrected preprint and Fig. 7, Table S12, Fig. S7-S8.

      Reviewer #2 (Public review):

      Summary:

      Soham Mukhopadhyay et al. investigated the protein folding of the secretome from gall-forming microbes using the AI-based structure modeling tool AlphaFold2. Their study analyzed six gall-forming species, including two Plasmodiophorid species and four others spanning different kingdoms, along with one non-gall-forming Plasmodiophorid species, Polymyxa betae. The authors found no effector fold specifically conserved among gall-forming pathogens, leading to the conclusion that their virulence strategies are likely achieved through diverse mechanisms. However, they identified an expansion of the Ankyrin repeat family in two gall-forming Plasmodiophorid species, with a less pronounced presence in the non-gall-forming Polymyxa betae. Additionally, the study revealed that known effectors such as CCG and AvrSen1 belong to sequence-unrelated but structurally similar (SUSS) effector clusters.

      Strengths:

      (1) The bioinformatics analyses presented in this study are robust, and the AlphaFold2-derived resources deposited in Zenodo provide valuable resources for researchers studying plant-microbe interactions. The manuscript is also logically organized and easy to follow.

      (2) The inclusion of the non-gall-forming Polymyxa betae strengthens the conclusion that no effector fold is specifically conserved in gall-forming pathogens and highlights the specific expansion of the Ankyrin repeat family in gall-forming Plasmodiophorids.

      (3) Figure 4a and 4b effectively illustrate the SUSS effector clusters, providing a clear visual representation of this finding.

      (4) Figure 1 is a well-designed, comprehensive summary of the number and functional annotations of putative secretomes in gall-forming pathogens. Notably, it reveals that more than half of the analyzed effectors lack known protein domains in some pathogens, yet some were annotated based on their predicted structures, despite the absence of domain annotations.

      Weaknesses:

      (1) The effector families discussed in this paper remain hypothetical in terms of their functional roles, which is understandable given the challenges of demonstrating their functions experimentally. However, this highlights the need for experimental validation as a next step.

      Thank you. Yes, there is a lot of work to do in the coming years.

      (2) Some analyses, such as those in Figure 4e, emphasize motifs derived from sequence alignments of SUSS effector clusters. Since these effectors are sequence-unrelated, sequence alignments might be unreliable. It would be more rigorous to perform structure-based alignments in addition to sequence-based ones for motif confirmation. For instance, methods described in Figure 3E of de Guillen et al. (2015, https://doi.org/10.1371/journal.ppat.1005228) or tools like Foldseek could be useful for aligning structures of multiple sequences.

      In Fig. 4e, we highlight the conserved cysteine residues. While there is no clearly conserved overall motif, the figure illustrates that despite the high sequence divergence, the key cysteines involved in disulfide bridge formation are consistently conserved across the sequences.

      (3) When presenting AlphaFold-generated structures, it is essential to include confidence scores such as pLDDT and PAE. For example, in Figure 1D of Derbyshire and Raffaele (2023, https://doi.org/10.1038/s41467-023-40949-9), the structural representations were colored red due to their high pLDDT scores, emphasizing their reliability.

      Thank you for the observation. Due to the restrictive parameters used in our analysis, over 90% of the structure would appear red. For this reason, we chose not to include the color scale, as it would not provide additional informative value in this context.

      Reviewer #1 (Recommendations for the authors):

      Experimental validation of the significance of 'CCG' and 'RAYH' motifs would further strengthen this study.

      Regarding the Mig1-like protein in Ustilago maydis, the presence of four conserved cysteine residues that are pivotal for maintaining the stability of its folded structure raises an intriguing question. Specifically, while many Mig cluster effectors contain four cysteine residues that form two conserved disulfide bridges, this structure is notably absent in the Mig protein itself. The author has speculated that these four cysteine residues form two conserved disulfide bonds, which are crucial for the stability of Mig protein folding. However, this hypothesis remains unvalidated. To test this prediction, it would be prudent to simulate mutations in the cysteine residues corresponding to the disulfide bonds in Mig and employ molecular dynamics simulations to assess the stability of folding before and after the mutation.

      Mig-1 does contain the four conserved cysteine residues responsible for forming disulfide bridges. However, due to the high divergence among Mig-1-like sequences, the alignment software was unable to properly align all the cysteine residues. As a result, Mig-1 may appear to lack these conserved cysteines in the alignment, although they are indeed present upon individual inspection. This is an area that research groups working with U. maydis as a model could explore further to expand our understanding of this effector family.

      Could you please clarify why talking about Ankyrins and LRR in Arabidopsis thaliana (line 252)? Additionally, what are the structural and functional differences between the LRR sequences of P. brassicae and those of the host plants?

      This sentence refers to the identification of the ANK motif in P. brassicae and S. subterranea, not in Arabidopsis thaliana. While the hydrophobic core of the ANK domains appears conserved between the host and the pathogen, the surface residues are highly polymorphic.

      The evidence supporting the interaction between the ANK effector and Arabidopsis immunity-related proteins, as validated using AlphaFold-Multimer, is currently limited. To enhance the reliability of these data, it is advisable for the author to select several pairs of proteins predicted to interact for further experimental verification.

      We conducted a large-scale yeast two-hybrid (Y2H) screen using the ANK domain effector PBTT_00818, which was selected due to its high ipTM+pTM score. The Y2H interactions were subsequently validated through BiFC assays. Our results show that PBTT_00818 interacts with Arabidopsis MPK3 in the nucleus, consistent with predictions from the AlphaFold-Multimer model. In addition, PBTT_00818 was found to target AT3G56460, a GroES-like zinc-binding alcohol dehydrogenase, which is also localized in the nucleus.

      While the manuscript is well-composed, certain sections could be enhanced for clarity and readability. For example, the discussion section could be expanded to include a more in-depth analysis of the implications of the findings for understanding the virulence mechanisms of gall-forming microbes. Additionally, a comparison of the findings with previous studies on related pathogens would provide a more comprehensive perspective.

      Certain sections of the discussion have been expanded. However, we chose to focus on the novel aspects of the study and to avoid comparisons with other plant pathogens, as those mechanisms are already well known and extensively studied. Studies using AlphaFold in plant pathology are also limited.

      Reviewer #2 (Recommendations for the authors):

      The results of clustering analyses are highly dependent on the chosen thresholds. Given that the authors provide clear and well-designed visualizations of SUSS effectors in Figures 4a and 4b, applying the same presentation methods to Figures 5a and 5b could make these analyses more convincing.

      We were able to generate the all-vs-all matrix for Figures 4a and 4b because it involved only 13 proteins. However, Figure 5b includes over 40 effectors, making it impractical to visualize the data in the same way. Instead, we presented the sequence-based clusters as nodes and connected them based on structural similarity.

    1. eLife Assessment

      This valuable study presents computational analyses of over 5,000 predicted extant and ancestral nitrogenase structures. The data analyses are convincing and offer unique insights into the relationship between structural evolution and environmental and biological phenotypes. The data generated in this study provide a vast resource that can serve as a starting point for studies of reconstructed and extant nitrogenases.

    2. Reviewer #1 (Public review):

      This was a clearly written manuscript that did an excellent job summarizing complex data. In this manuscript, Cuevas-Zuviría et al. use protein modeling to generate over 5,000 predicted structures of nitrogenase components, encompassing both extant and ancestral forms across different clades. The study highlights that key insertions define the various Nif groups. The authors also examined the structures of three ancestral nitrogenase variants that had been previously identified and experimentally tested. These ancestral forms were shown in earlier studies to exhibit reduced activity in Azotobacter vinelandii, a model diazotroph.

    3. Reviewer #2 (Public review):

      Summary:

      This work aims to study the evolution of nitrogenases, understanding how their structure and function adapted to changes in environment, including oxygen levels and changes in metal availability.

      The study predicts > 3000 structures of nitrogenases, corresponding to extant, ancestral and alternative ancestral sequences. It is observed that structural variations in the nitrogenases correlate with phylogenetic relationships. The amount of data generated in this study represents a massive and admirable undertaking. The study also provides strong insight into how structural evolution correlates with environmental and biological phenotypes.

    4. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public review):

      Comments on revisions:

      I appreciate the authors responding to my comments. I think Fig. S10 helps put the structural data into more context. It would be helpful to make clearer in the legend what proteins are being compared, especially in 10C.

      Although I can see why the authors focus on the NifK extension and its potential connection to oxygen protection, I would point out that Vnf and Anf do not have this extension in their K subunit, and you find both Vnf and Anf in aerobic and facultative anaerobic diazotrophs. This is a minor point, but I think it is important to mention in the discussion.

      We thank the reviewer for their thoughtful comments. Following their recommendation, we have added a line to the Discussion and moved Figure S10 to the main text.

      Reviewer #2 (Public review):

      Summary: 

      This work aims to study the evolution of nitrogenases, understanding how their structure and function adapted to changes in environment, including oxygen levels and changes in metal availability.

      The study predicts > 3000 structures of nitrogenases, corresponding to extant, ancestral and alternative ancestral sequences. It is observed that structural variations in the nitrogenases correlate with phylogenetic relationships. The amount of data generated in this study represents a massive and admirable undertaking. The study also provides strong insight into how structural evolution correlates with environmental and biological phenotypes. 

      We thank the reviewer for their summary and positive appraisal.

    1. eLife Assessment

      This fundamental study characterizes the mechanics and stability of bolalipids from archaeal membranes using a minimalist, physics-based computational model. The authors present a robust mesoscale model of bolalipid-containing membranes, systematically evaluating it across diverse membrane configurations. The results are compelling, demonstrating that the incorporation of bolalipids and regular bilayer lipids in archaeal membranes significantly enhances membrane fluidity and structural stability.

    2. Reviewer #2 (Public review):

      Summary:

      The authors aimed to understand the biophysical properties of archaeal membranes made of bolalipids. Bacterial and eukaryotic membranes are made of lipids that self-assemble into bilayers. Archaea, instead, use bolalipids, lipids that have two headgroups and can span the entire bilayer. The authors wanted to determine if the unique characteristics of archaea, which are often extremophiles, are in part due to the fact that their membranes contain bolalipids.

      The authors develop a minimal computational model to compare the biophysics of bilayers made of lipids, bolalipids, and mixtures of the two. Their model enables them to determine essential parameters such as bilayer phase diagrams, mechanical moduli, and the bilayer behavior upon cargo inclusion and remodeling.

      The authors demonstrate that bolalipid bilayers behave as binary mixtures, containing bolalipids organized either in a straight conformation, spanning the entire bilayer, or in a U-shaped one, confined to a single leaflet. This dynamic mixture allows bolalipid bilayers to be very sturdy while still permitting remodeling. However, remodeling is energetically more expensive than with standard lipids. The authors speculate that this might be why lipids were more abundant in the evolutionary process.

      Strengths:

      This is a wonderful paper, a very fine piece of scholarship. It is interesting from the point of view of biology, biophysics, and material science. The authors mastered the modeling and analysis of these complex systems. The evidence for their findings is really strong and complete. The paper is written superbly, the language is precise, and the reading experience is very pleasant. The plots are very well thought out.

      Weaknesses:

      None. The authors have addressed all the potential weaknesses that were raised by the reviewers.

    3. Reviewer #3 (Public review):

      Summary:

      The authors have studied the mechanics of bolalipid and archaeal mixed-lipid membranes via comprehensive molecular dynamics simulations. The Cooke-Deserno 3-bead-per-lipid model is extended to bolalipids with 6 beads. Phase diagrams, bending rigidity, mechanical stability of curved membranes, and cargo uptake are studied. Effects such as formation of U-shaped bolalipids, pore formation in highly curved regions, and changes in membrane rigidity are studied and discussed. The main aim has been to show how the mixture of bolalipids and regular bilayer lipids in archaeal membrane models enhances the fluidity and stability of these membranes.

      The authors have presented a wide range of simulation results for different membrane conditions and conformations. Analyses and findings are presented clearly and concisely. Figures, supplementary information and movies are of very high quality and very well present what has been studied. The manuscript is well written and is easy to follow.

      The authors have provided a detailed response to the points I raised on the first version and have revised their manuscript accordingly. Hence, I only mention what, in my opinion, still deserves to be noted.

      Comments:

      I previously raised an issue with respect to the resort to the Hamm-Kozlov model for fitting the power spectrum of membrane undulations. The authors provided very nice arguments against my concerns. For the sake of completeness, I include a simple scenario, which will better highlight the issue:

      The tilt contribution to the Helfrich Hamiltonian can be written as a quadratic term 1/2 k_t |T|^2, where T is a tilt vector field. This field is written as the difference between the surface normal and the director field aligned with the lipid orientations. In the small deviation Monge description with z=h(x, y) as the height function, the surface normal has the form N=(-dh/dx, -dh/dy, 1). Now assume the director field, n = (b_x, b_y, 1) with small b_x and b_y components. The tilt contribution to the energy thus reads as 1/2 k_t (N - n)^2 ~= 1/2 k_t [|grad h|^2 + 2 b . grad h]. The first term, 1/2 k_t |grad h|^2, is indeed similar to a surface tension term, \sigma |grad h|^2 that you get from the (1 + 1/2 |grad h|^2) approximation to the area element. Therefore, if you only look at height fluctuations, while your membrane actually has some surface tension, it will make distinguishing the tilt contributions to the fluctuations in the linear Monge gauge impossible.
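
      Written out explicitly, the expansion above reads as follows. This is only a sketch using the same symbols as the paragraph, with \mathbf{b} = (b_x, b_y); the |\mathbf{b}|^2 term is dropped, on the assumption that it is negligible for small b_x and b_y, and the area-element expansion is the standard small-gradient one.

```latex
\begin{align}
\mathbf{N} - \mathbf{n} &= \left(-\partial_x h - b_x,\; -\partial_y h - b_y,\; 0\right), \\
\tfrac{1}{2} k_t \,|\mathbf{N} - \mathbf{n}|^2
  &= \tfrac{1}{2} k_t \left[\, |\nabla h|^2 + 2\,\mathbf{b}\cdot\nabla h + |\mathbf{b}|^2 \right]
  \approx \tfrac{1}{2} k_t \left[\, |\nabla h|^2 + 2\,\mathbf{b}\cdot\nabla h \right], \\
\sigma \sqrt{1 + |\nabla h|^2}
  &\approx \sigma \left( 1 + \tfrac{1}{2} |\nabla h|^2 \right).
\end{align}
```

      Both the tilt and the tension contributions therefore contain a term proportional to |\nabla h|^2, which is why height fluctuations alone cannot separate k_t from \sigma unless \sigma is known to vanish.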

      However, considering that the authors have made sure that the membrane is indeed tensionless, this argument is settled.

      I had also raised an issue about the correct NpT sampling in the simulations, and I'm glad that the authors also set up more rigorously thermostatted/barostatted simulations to check the validity of their findings.

      Also, from the SI, I previously noted that the authors had neglected the longest wavelength mode because it was not equilibrated. This was an important problem and the authors looked into it and ran more simulations that were better equilibrated.

      The analysis of energy of U-shaped lipids with the linear model E=c_0 + c_1 * k_bola is indeed very interesting. I am glad that the authors have expanded this analysis and included mean energy measurements.
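
      As an illustration of the kind of fit referred to here, the following is a minimal sketch of an ordinary least-squares fit of E = c_0 + c_1 * k_bola. The function name `fit_linear` and its inputs are hypothetical; this is not the authors' analysis code, and no values from the manuscript are used.

```python
def fit_linear(k_values, energies):
    """Ordinary least-squares fit of E = c_0 + c_1 * k; returns (c_0, c_1)."""
    n = len(k_values)
    if n < 2 or n != len(energies):
        raise ValueError("need at least two (k, E) pairs of equal length")
    mean_k = sum(k_values) / n
    mean_e = sum(energies) / n
    # Slope is covariance(k, E) divided by variance(k).
    s_kk = sum((k - mean_k) ** 2 for k in k_values)
    if s_kk == 0:
        raise ValueError("k values must not all be identical")
    s_ke = sum((k - mean_k) * (e - mean_e) for k, e in zip(k_values, energies))
    c_1 = s_ke / s_kk
    # Intercept follows from the fitted line passing through the mean point.
    c_0 = mean_e - c_1 * mean_k
    return c_0, c_1
```

      Passing the mean energies measured at each k_bola would return the intercept c_0 and the slope c_1 of the fitted line.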

    1. eLife Assessment

      This compelling work describes how the cell cycle-regulating phosphatase subunit, RepoMan, is regulated by the oxygen-dependent, metabolite-sensing hydroxylase PHD1. The characterisation of how proline hydroxylation alters signalling at the molecular and cellular level provides important evidence to enhance our understanding of how 2-oxoglutarate-dependent dioxygenases influence the cell cycle and mitosis.

    2. Reviewer #1 (Public review):

      Summary:

      The study by Druker et al. shows that siRNA depletion of PHD1, but not PHD2, increases H3T3 phosphorylation in cells arrested in prometaphase. Additionally, the expression of wild-type RepoMan, but not the RepoMan P604A mutant, restored normal H3T3 phosphorylation localization in cells arrested in prometaphase. Furthermore, the study demonstrates that expression of the RepoMan P604A mutant leads to defects in chromosome alignment and segregation, resulting in increased cell death. These data support a role for PHD1-mediated prolyl hydroxylation in controlling progression through mitosis. This occurs, at least in part, by hydroxylating RepoMan at P604, which regulates its interaction with PP2A during chromosome alignment.

      Strengths:

      The data support most of the conclusions made. However, some issues need to be addressed.

      Weaknesses:

      (1) Although ectopically expressed PHD1 interacts with ectopically expressed RepoMan, there is no evidence that endogenous PHD1 binds to endogenous RepoMan or that PHD1 directly binds to RepoMan.

      (2) There is no genetic evidence indicating that PHD1 controls progression through mitosis by catalyzing the hydroxylation of RepoMan.

      (3) Data demonstrating the correlation between dynamic changes in RepoMan hydroxylation and H3T3 phosphorylation throughout the cell cycle are needed.

      (4) The authors should provide biochemical evidence of the difference in binding ability between RepoMan WT/PP2A and RepoMan P604A/PP2A.

      (5) PHD2 is the primary proline hydroxylase in cells. Why does PHD1, but not PHD2, affect RepoMan hydroxylation and subsequent control of mitotic progression? The authors should discuss this issue further.

    3. Reviewer #2 (Public review):

      Summary:

      This is a concise and interesting article on the role of PHD1-mediated hydroxylation of proline residue 604 on RepoMan and its impact on RepoMan-PP1 interactions with the phosphatase PP2A-B56 complex, leading to dephosphorylation of H3T3 on chromosomes during mitosis. Through biochemical and imaging tools, the authors delineate a key mechanism in the regulation of the progression of the cell cycle. The experiments performed are conclusive with well-designed controls.

      Strengths:

      The authors have utilized cutting-edge imaging and colocalization detection technologies to infer the conclusions in the manuscript.

      Weaknesses:

      Lack of in vitro reconstitution and binding data.

    4. Reviewer #3 (Public review):

      Summary:

      The manuscript is a comprehensive molecular and cell biological characterisation of the effects of P604 hydroxylation by PHD1 on RepoMan, a regulatory subunit of the PP1gamma complex. The identification and molecular characterisation of the hydroxylation site have been written up and deposited in bioRxiv in a separate manuscript. I reviewed the data and came to the conclusion that the hydroxylation site has been identified and characterised to a very high standard by LC-MS, both in cells and in in vitro reactions. I conclude that we should have no question about the validity of the PHD1-mediated hydroxylation.

      In the context of the presented manuscript, the authors postulate that hydroxylation on P604 by PHD1 leads to the inactivation of the complex, resulting in the retention of pThr3 in H3.

      Strengths:

      Compelling data and characterisation of how P604 hydroxylation is likely to induce the interaction between RepoMan and a phosphatase complex, resulting in loading of RepoMan on chromatin. Loss of the regulation of the hydroxylation site by PHD1 results in mitotic defects.

      Weaknesses:

      Reliance on a Proline-Alanine mutation in RepoMan to mimic an unhydroxylatable protein. The mutation will introduce structural alterations, and inhibition or knockdown of PHD1 would be necessary to strengthen the data on how hydroxylation regulates chromatin loading and interactions with B56/PP2A.

    5. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      We appreciate the reviewer’s agreement that our data “support most of the conclusions made”.

      With respect to the concerns raised by Reviewer 1:

      (1) Although ectopically expressed PHD1 interacts with ectopically expressed RepoMan, there is no evidence that endogenous PHD1 binds to endogenous RepoMan or that PHD1 directly binds to RepoMan.

      We do not fully agree that this comment is accurate. The implication is that we only show an interaction between two exogenously expressed proteins, i.e. both exogenous PHD1 and RepoMan, when in fact we show that tagged PHD1 interacts with endogenous RepoMan. The major technical challenge here is the well-known difficulty of detecting endogenous PHD1 in such cell lines. We agree that co-IP studies do not prove that this interaction is direct and never claim to have shown this, though we do feel that a direct interaction is most likely, albeit not proven.

      (2) There is no genetic evidence indicating that PHD1 controls progression through mitosis by catalyzing the hydroxylation of RepoMan.

      We agree that our current study is primarily a biochemical and cell biological study, rather than a genetic study. Nonetheless, similar biochemical and cellular approaches have been widely used and validated in previous studies of mechanisms regulating cell cycle progression, and we are confident in the conclusions drawn based on the data obtained so far.

      (3) Data demonstrating the correlation between dynamic changes in RepoMan hydroxylation and H3T3 phosphorylation throughout the cell cycle are needed.

      We agree that it will be very interesting to analyse in more detail the cell cycle dynamics of RepoMan hydroxylation and H3T3 phosphorylation - along with other cell cycle parameters. We view this as outside the scope of our present study and are actively engaged in raising the additional funding needed to pursue such future experiments.

      (4) The authors should provide biochemical evidence of the difference in binding ability between RepoMan WT/PP2A and RepoMan P604A/PP2A.

      Here again we agree that it will be very interesting to analyse in future the detailed binding interactions between wt and mutant RepoMan and other interacting proteins, including PP2A. We view this as outside the scope of our present study and are actively engaged in raising the additional funding needed to pursue such future experiments.

      (5) PHD2 is the primary proline hydroxylase in cells. Why does PHD1, but not PHD2, affect RepoMan hydroxylation and subsequent control of mitotic progression? The authors should discuss this issue further.

      We agree with the main point underlying this comment, i.e., that there are still many things to be learned concerning the specific roles and mechanisms of the different PHD enzymes in vivo. We look forward to addressing these questions in future studies.

      Reviewer #2 (Public review):

      We appreciate the reviewer’s comments that our manuscript uses biochemical and imaging tools to delineate a key mechanism in the regulation of the progression of the cell cycle and their appreciation that our experiments are 'conclusive with well-designed controls.'

      With respect to the specific concern raised by reviewer 2:

      Lack of in vitro reconstitution and binding data.

      We agree that it will be very interesting to pursue in vitro reconstitution studies and detailed binding data. We view this as outside the scope of our present study and are actively engaged in raising the additional funding needed to pursue such future experiments.

      Reviewer #3 (Public review):

      We appreciate the reviewer’s comments that our study, “is a comprehensive molecular and cell biological characterisation of the effects of P604 hydroxylation by PHD1 on RepoMan, a regulatory subunit of the PP1gamma complex” and their conclusion that, “we should have no question about the validity of the PHD1-mediated hydroxylation”.

      With respect to the specific concern raised by reviewer 3:

      Reliance on a Proline-Alanine mutation in RepoMan to mimic an unhydroxylatable protein. The mutation will introduce structural alterations, and inhibition or knockdown of PHD1 would be necessary to strengthen the data on how hydroxylation regulates chromatin loading and interactions with B56/PP2A.

      We do not agree that we rely solely on analysis of the single site pro-ala mutation in RepoMan for our conclusions, since we also present a raft of additional experimental evidence, including knock-down data and experiments using both fumarate and FG. We would also reference the data we present on RepoMan in the parallel study by Jiang et al., which has also been reviewed by eLife and is currently available on bioRxiv (doi: https://doi.org/10.1101/2025.05.06.652400). Of course we agree with the reviewer that even though the mutant RepoMan features only a single amino acid change, this could still result in undetermined structural effects on the RepoMan protein that could conceivably contribute, at least in part, to some of the phenotypic effects observed. Hopefully future studies will help to clarify this.

    1. eLife Assessment

      This manuscript presents solid experimental data using Fmr1 knockout mice to explore the fundamental role of Fmr1 in sleep regulation. The study supports the hypothesis that scheduled feeding can improve circadian rhythm and behavior in a mouse model of Fragile X syndrome. These findings may offer new insights into neurodevelopmental disorders and their potential treatment strategies.

    2. Reviewer #1 (Public review):

      The authors conducted a comprehensive investigation into sleep and circadian rhythm disturbances in Fmr1 knockout (KO) mice, a model for Fragile X Syndrome (FXS). They began by monitoring daily home cage behaviors to identify disruptions in sleep and circadian patterns, then assessed the mice's adaptability to altered light conditions through photic suppression and skeleton photoperiod experiments. To uncover potential mechanisms, they examined the connectivity between the retina and the suprachiasmatic nucleus. The study also included an analysis of social behavior deficits in the mutant mice and tested whether scheduled feeding could alleviate these issues. Notably, scheduled feeding not only improved sleep, circadian, and social behaviors but also normalized plasma cytokine levels. The manuscript is strengthened by its focus on a significant and underexplored area (sleep deficits in an FXS model) and by its robust experimental design, which integrates a variety of methodological approaches to provide a thorough understanding of the observed phenomena and potential therapeutic avenues.

    3. Reviewer #2 (Public review):

      Summary:

      In the present study, the authors, using a mouse model of Fragile X syndrome, explore the intriguing hypothesis that restricting food access over the daily schedule will improve sleep patterns and subsequently enhance behavioral capacities. By restricting food access from 12h to 6h over the nocturnal period (the active period for mice), they show, in these KO mice, an improvement in the sleep pattern accompanied by reduced systemic levels of inflammatory markers and improved behavior. These data, using a classical mouse model of neurodevelopmental disorder (NDD), suggest that modifying eating patterns might improve sleep quality, leading to reduced inflammation and enhanced cognitive/behavioral capacities in children with NDD.

      Overall, the paper is well-written and easy to follow. The rationale of the study is generally well introduced. Data are globally sound. The interpretation is overall supported by the provided data.

    4. Author response:

      The following is the authors’ response to the previous reviews

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Thank you for the extensive response to my comments and questions.

      Reviewer #2 (Recommendations for the authors):

      (1) The Fmr1/Fxr2 double KO mice are not well described in the Introduction.

      We have changed the sentence in the introduction to clarify that Zhang et al., 2008 used a mouse lacking both the Fmr1 gene and its paralog Fxr2.

      (3) The Authors decided not to discuss the potential translation of the present study to human patients, despite their final conclusion statement.

      The paragraph below has been added to the end of the discussion:

      “Translational Implications”

      The present findings support the view that circadian disruption is not merely a downstream consequence of disease processes but actively contributes to symptom expression. Hence, interventions designed to reinforce circadian rhythms may hold therapeutic value for individuals with FXS and related neurodevelopmental conditions. Given that sleep and circadian dysfunction are detectable early in development and are predictive of more severe clinical phenotypes, circadian-based interventions may be particularly beneficial if applied during periods of heightened neural plasticity. Importantly, time-restricted feeding represents a relatively low-cost, non-invasive strategy that could be feasibly implemented in real-world settings. Further translational work is needed to evaluate whether the mechanistic links identified here (between circadian misalignment, immune dysregulation, and behavioral impairments) are conserved in humans, and whether similar approaches can be implemented for clinical use.

    1. eLife Assessment

      This study presents an important finding on the signaling mechanisms underlying Treg cell homeostasis by identifying the simultaneous requirement of diacylglycerol (DAG) kinases (DGK) alpha and zeta for Foxp3+ Treg cell function and follicular responses, with implications for the pathogenesis of some autoimmune diseases. Whereas data based on the characterization of double knock-out mice (for DGK alpha and zeta) is solid, showing the emergence of autoimmune manifestations, the study has gaps in its experimental approaches since it is not clear what can be attributed to the simultaneous DGKα and ζ deficiency, versus the individual deficiency of either one. Experiments on the pathogenic potential of the DKO Tregs in the absence of other T-cells were not presented and results on the role of CD25 downregulation and CD28-independent activation of Treg cells were not properly discussed. Nonetheless, the reported data would be of interest to immunologists working on T-cell intracellular signaling and autoimmunity.

    2. Reviewer #1 (Public review):

      Summary:

      The manuscript by Li and colleagues describes the impact of deficiency of DGKα and ζ on Treg cells and follicular responses. The experimental approach is based on the characterization of double KO mice that show the emergence of autoimmune manifestations that include the production of autoantibodies. Additionally, there is an increase in Tfh cells, but also Tfr cells in these mice deficient in both DGKα and ζ. Although the observations are interesting, the interpretation of the observations is difficult in the absence of data related to single mutations. While a supplementary figure shows that the autoimmune manifestations are more severe in the DGKα and ζ deficient mice, prior observations show that a single DGKα deficiency has an impact on Treg homeostasis. As such, the contribution of the two chains to the overall phenotype is hard to establish.

      Strengths:

      Well-conducted experiments with informative mouse models with defined genetic defects.

      Weaknesses:

      The major weakness is the lack of clarity concerning what can be attributed to simultaneous DGKα and ζ deficiency versus deficiency of DGKα or ζ alone. Technical concerns related to a number of figures were raised in the initial report and not adequately addressed by the authors in the revised manuscript.

      In conclusion, the claims in the manuscript are not convincingly supported by the data.

    3. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Li et al. investigate the combined role of diacylglycerol (DAG) kinases (DGK) α and ζ in the function of Foxp3+ Treg cells, which prevent autoimmunity. The authors generated DGK α and ζ Treg-specific double knockout mice (DKO) by crossing Dgkalpha-/- mice to DgKzf/f and Foxp3YFPCre/+ mice. The resulting "DKO" mice thus lack DGK α in all cells and DGK ζ in Foxp3+Treg cells. The authors show that the DKO mice spontaneously develop autoimmunity, characterized by multiorgan inflammatory infiltration and elevated anti-double-strand DNA (dsDNA), -single-strand DNA (ssDNA), and -nuclear autoantibodies. The authors attribute the DKO mice phenotype to Foxp3+Treg dysfunction, including accelerated conversion into "exTreg" cells with pathogenic activity. Interestingly, the combined deficiency of DGK α and ζ seems to release Treg cell dependence on CD28-mediated costimulatory signals, which the authors show by crossing their DKO mice to CD28-/- mice (TKO mice), which also develop autoimmunity.

      Strengths:

      The phenotypes of the mutant mice described in the manuscript are striking, and the authors provide a comprehensive analysis of the functional processes altered by the lack of DGKs.

      Weaknesses:

      One aspect that could be better explored is the direct role of "ex-Tregs" in causing pathogenesis in the models utilized.

      But overall, this is an important report that makes a significant addition to the understanding of DAG kinases in Treg cell biology.

    4. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The manuscript by Li and colleagues describes the impact of deficiency of DGKα and ζ on Treg cells and follicular responses. The experimental approach is based on the characterization of double KO mice that show the emergence of autoimmune manifestations that include the production of autoantibodies. Additionally, there is an increase in Tfh cells, but also Tfr cells in these mice deficient in both DGKα and ζ. Although the observations are interesting, the interpretation of the observations is difficult in the absence of data related to single mutations. While a supplementary figure shows that the autoimmune manifestations are more severe in the DGKα and ζ deficient mice, prior observations show that a single DGKα deficiency has an impact on Treg homeostasis. As such, the contribution of the two chains to the overall phenotype is hard to establish.

      Strengths:

      Well-conducted experiments with informative mouse models with defined genetic defects.

      Weaknesses:

      The major weakness is the lack of clarity concerning what can be attributed to simultaneous DGKα and ζ deficiency versus deficiency of DGKα or ζ alone.

      Some interpretations are also not conclusively supported by data.

      We appreciate reviewer 1’s positive comments about our manuscript and the suggestion to include DGKα‑ or DGKζ‑single‑knockout (SKO) Tregs for the mechanistic studies. Unfortunately, performing this seemingly simple but truly extensive experiment would exceed our current budget and personnel capacity. Importantly, it is well known that DGKα and DGKζ act redundantly or synergistically in T cells, with single loss producing minimal or partial phenotypes compared with the double knockout. The comprehensive mechanistic data already presented for DGKαζ‑DKO Tregs therefore capture the combined functional and mechanistic deficit that is most relevant to DGK functions in Treg biology, and they support the conclusions drawn in this manuscript. The reviewer also pointed out some interpretation issues, such as CD25 downregulation in Tfr cells, and some minor issues. We appreciate the reviewer’s expertise and have revised the text and discussion accordingly.

      Reviewer #2 (Public review):

      Summary:

      In this manuscript, Li et al. investigate the combined role of diacylglycerol (DAG) kinases (DGK) α and ζ in the function of Foxp3+ Treg cells, which prevent autoimmunity. The authors generated DGK α and ζ Treg-specific double knockout mice (DKO) by crossing Dgkalpha-/- mice to DgKzf/f and Foxp3YFPCre/+ mice. The resulting "DKO" mice thus lack DGK α in all cells and DGK ζ in Foxp3+Treg cells. The authors show that the DKO mice spontaneously develop autoimmunity, characterized by multiorgan inflammatory infiltration and elevated anti-double-strand DNA (dsDNA), -single-strand DNA (ssDNA), and -nuclear autoantibodies. The authors attribute the DKO mice phenotype to Foxp3+Treg dysfunction, including accelerated conversion into "exTreg" cells with pathogenic activity. Interestingly, the combined deficiency of DGK α and ζ seems to release Treg cell dependence on CD28-mediated costimulatory signals, which the authors show by crossing their DKO mice to CD28-/- mice (TKO mice), which also develop autoimmunity.

      Strengths:

      The phenotypes of the mutant mice described in the manuscript are striking, and the authors provide a comprehensive analysis of the functional processes altered by the lack of DGKs.

      Weaknesses:

      One aspect that could be better explored is the direct role of "ex-Tregs" in causing pathogenesis in the models utilized.

      However, overall, this is an important report that makes a significant addition to the understanding of DAG kinases in Treg cell biology.

      We greatly appreciate reviewer 2’s positive comments about the manuscript. The data we presented in the manuscript show that DGKαζDKO Tregs, but not WT Tregs, are able to trigger autoimmunity in T cell-deficient mice in the presence of WT CD4 T cells, supporting the conclusion that DGKαζDKO Tregs are pathogenic. Reviewer 2 suggested testing the direct role of DGKαζDKO Treg/ex-Tregs in the pathogenesis of autoimmune diseases in the absence of conventional T cells. This is a really interesting idea that we will test in the future should resources for executing the experiment become available.

    1. eLife Assessment

      This important study decoded target-associated information in prefrontal and sensory cortex during the preparatory period of a visual search task, suggesting a memory component when human subjects perform such visual attention tasks. The evidence supporting this claim is compelling, based on multivariate pattern analyses of fMRI data. The results will be of interest to psychologists and cognitive neuroscientists.

    2. Reviewer #1 (Public review):

      When you search for something, you need to maintain some representation (a "template") of that target in your mind/brain. Otherwise, how would you know what you were looking for? If your phone is in a shocking pink case, you can guide your attention to pink things based on a target template that includes the attribute 'pink'. That guidance should get you to the phone pretty effectively, if it is in view. Most real-world searches are more complicated. If you are looking for the toaster, you will make use of your knowledge of where toasters can be. Thus, if you are asked to find a toaster, you might first activate a template of a kitchen or a kitchen counter. You might worry about pulling up the toaster template only after you are reasonably sure you have restricted your attention to a sensible part of the scene.

      Zhou and Geng are looking for evidence of this early stage of guidance by information about the surrounding scene in a search task. They train Os to associate four faces with four places. Then, with Os in the scanner, they show one face - the target for a subsequent search. After an 8 sec delay, they show a search display where the face is placed on the associated scene 75% of the time. Thus, attending to the associated scene is a good idea. The questions of interest are "When can the experimenters decode which face Os saw from fMRI recording?" "When can the experimenters decode the associated scene?" and "Where in the brain can the experimenters see evidence of this decoding?" The answer is that the face but not the scene can be read out during the face's initial presentation. The key finding is that the scene can be read out (imperfectly but above chance) during the subsequent delay when Os are looking at just a fixation point. Apparently, seeing the face conjures up the scene in the mind's eye.

      This is a solid and believable result. The only issue, for me, is whether it is telling us anything specifically about search. Suppose you trained Os on the face-scene pairing but never did anything connected to search. If you presented the face, would you not see evidence of recall of the associated scene? Maybe you would see the activation of the scene in different areas and you could identify some areas as search specific. I don't think anything like that was discussed here.

      You might also expect this result to be asymmetric. The idea is that the big scene gives the search information about the little face. The face should activate the larger useful scene more than the scene should activate the more incidental face, if the task was reversed. That might be true if the finding is related to search where the scene context is presumed to be the useful attention guiding stimulus. You might not expect an asymmetry if Os were just learning an association.

      It is clear in this study that the face and the scene have been associated and that this can be seen in the fMRI data. It is also clear that a valid scene background speeds the behavioral response in the search task. The linkage between these two results is not entirely clear but perhaps future research will shed more light.

      It is also possible that I missed the clear evidence of the search-specific nature of the activation by the scene during the delay period. If so, I apologize and suggest that the point be underlined for readers like me.

      Comments on revised version:

      I am satisfied with the revision.

    3. Reviewer #2 (Public review):

      Summary:

      This work is one of the best instances of a well-controlled experiment and theoretically impactful findings within the literature on templates guiding attentional selection. I am a fan of the work that comes out of this lab and this particular manuscript is an excellent example as to why that is the case. Here, the authors use fMRI (employing MVPA) to test whether during the preparatory search period, a search template is invoked within the corresponding sensory regions, in the absence of physical stimulation. By associating faces with scenes, a strong association was created between two types of stimuli that recruit very specific neural processing regions - FFA for faces and PPA for scenes. The critical results showed that scene information that was associated with a particular cue could be decoded from PPA during the delay period. This result strongly supports the invoking of a very specific attentional template.

      Strengths:

      There is so much to be impressed with in this report. The writing of the manuscript is incredibly clear. The experimental design is clever and innovative. The analysis is sophisticated and also innovative. The results are solid and convincing.

      Weaknesses:

      I only have a few weaknesses to point out. This point is not so much of a weakness, but a further test of the hypothesis put forward by the authors. The delay period was long - 8 seconds. It would be interesting to split the delay period into the first 4 seconds and the last 4 seconds and run the same decoding analyses. The hypothesis here is that semantic associations take time to evolve, and it would be great to show that decoding gets stronger in the second delay period as opposed to the period right after the cue. I think it would be a stronger test of the template hypothesis.

      Typo in the abstract "curing" vs "during."

      It is hard to know what to do with significant results in ROIs that are not motivated by specific hypotheses. However, for Figure 3, what are explanations for ROIs that show significant differences above and beyond the direct hypotheses set out by the authors?

      Following the revision, I have no further comments or concerns.

    4. Reviewer #3 (Public review):

      The manuscript contains a carefully designed fMRI study, using MVPA pattern analysis to investigate which high-level associate cortices contain target-related information to guide visual search. A special focus is hereby on so-called 'target-associated' information, that has previously been shown to help in guiding attention during visual search. For this purpose the authors trained their participants and made them learn specific target-associations, in order to then test which brain regions may contain neural representations of those learnt associations. They found that at least some of the associations tested were encoded in prefrontal cortex during the cue and delay period.

      The manuscript is very carefully prepared. As far as I can see, the statistical analyses are all sound and the results integrate well with previous findings.

      I have no strong objections against the presented results and their interpretation.

      The authors have addressed all my previous comments and questions in their revision of the text.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      When you search for something, you need to maintain some representation (a "template") of that target in your mind/brain. Otherwise, how would you know what you were looking for? If your phone is in a shocking pink case, you can guide your attention to pink things based on a target template that includes the attribute 'pink'. That guidance should get you to the phone pretty effectively if it is in view. Most real-world searches are more complicated. If you are looking for the toaster, you will make use of your knowledge of where toasters can be. Thus, if you are asked to find a toaster, you might first activate a template of a kitchen or a kitchen counter. You might worry about pulling up the toaster template only after you are reasonably sure you have restricted your attention to a sensible part of the scene.

      Zhou and Geng are looking for evidence of this early stage of guidance by information about the surrounding scene in a search task. They train Os to associate four faces with four places. Then, with Os in the scanner, they show one face - the target for a subsequent search. After an 8 sec delay, they show a search display where the face is placed on the associated scene 75% of the time. Thus, attending to the associated scene is a good idea. The questions of interest are "When can the experimenters decode which face Os saw from fMRI recording?" "When can the experimenters decode the associated scene?" and "Where in the brain can the experimenters see evidence of this decoding?" The answer is that the face but not the scene can be read out during the face's initial presentation. The key finding is that the scene can be read out (imperfectly but above chance) during the subsequent delay when Os are looking at just a fixation point. Apparently, seeing the face conjures up the scene in the mind's eye.

      This is a solid and believable result. The only issue, for me, is whether it is telling us anything specifically about search. Suppose you trained Os on the face-scene pairing but never did anything connected to the search. If you presented the face, would you not see evidence of recall of the associated scene? Maybe you would see the activation of the scene in different areas and you could identify some areas as search specific. I don't think anything like that was discussed here.

      You might also expect this result to be asymmetric. The idea is that the big scene gives the search information about the little face. The face should activate the larger useful scene more than the scene should activate the more incidental face, if the task was reversed. That might be true if the finding is related to a search where the scene context is presumed to be the useful attention guiding stimulus. You might not expect an asymmetry if Os were just learning an association.

      It is clear in this study that the face and the scene have been associated and that this can be seen in the fMRI data. It is also clear that a valid scene background speeds the behavioral response in the search task. The linkage between these two results is not entirely clear but perhaps future research will shed more light.

      It is also possible that I missed the clear evidence of the search-specific nature of the activation by the scene during the delay period. If so, I apologize and suggest that the point be underlined for readers like me.

      We have added text related to this issue, particularly in the discussion (page 19, line 6), and have also added citations of studies in humans and non-human primates showing a causal relationship between preparatory activity in prefrontal and visual cortex and visual search performance (page 6, line 16).

      Reviewer #2 (Public review):

      Summary:

      This work is one of the best instances of a well-controlled experiment and theoretically impactful findings within the literature on templates guiding attentional selection. I am a fan of the work that comes out of this lab and this particular manuscript is an excellent example as to why that is the case. Here, the authors use fMRI (employing MVPA) to test whether during the preparatory search period, a search template is invoked within the corresponding sensory regions, in the absence of physical stimulation. By associating faces with scenes, a strong association was created between two types of stimuli that recruit very specific neural processing regions - FFA for faces and PPA for scenes. The critical results showed that scene information that was associated with a particular cue could be decoded from PPA during the delay period. This result strongly supports the invoking of a very specific attentional template.

      Strengths:

      There is so much to be impressed with in this report. The writing of the manuscript is incredibly clear. The experimental design is clever and innovative. The analysis is sophisticated and also innovative. The results are solid and convincing.

      Weaknesses:

      I only have a few weaknesses to point out. This point is not so much of a weakness, but a further test of the hypothesis put forward by the authors. The delay period was long - 8 seconds. It would be interesting to split the delay period into the first 4 seconds and the last 4 seconds and run the same decoding analyses. The hypothesis here is that semantic associations take time to evolve, and it would be great to show that decoding gets stronger in the second delay period as opposed to the period right after the cue. I don't think this is necessary for publication, but I think it would be a stronger test of the template hypothesis.

      We conducted the suggested analysis, and we did not find clear evidence of differences in decoding scene information between the earlier and later portions of the delay period. This may be due to insufficient power when the data are divided, individual differences in when preparatory activation is the strongest, or truly no difference in activation over the delay period. More details of this analysis can be found in the supplementary materials (page 12, line 16; Figure S1).

      Typo in the abstract "curing" vs "during."

      Fixed.

      It is hard to know what to do with significant results in ROIs that are not motivated by specific hypotheses. However, for Figure 3, what are the explanations for ROIs that show significant differences above and beyond the direct hypotheses set out by the authors?

      We added reasoning for the other a priori ROIs in the introduction (page 4, line 26). There is substantial evidence suggesting that frontoparietal areas are involved in cognitive control, attentional control, and working memory. The ROIs we selected from frontal and parietal cortex are based on parcels within resting state networks defined by the s17-network atlases (Schaefer et al., 2018). The IFJ was defined by the HCP-MMP1 (Glasser et al., 2016). These regions are commonly used in studies of attention and cognitive control, and the exact ROIs selected are described in the section on “Regions of interest (ROI) definition”. While we have the strongest hypothesis for IFJ based on relatively recent work from the Desimone lab, the other ROIs in lateral frontal cortex and parietal cortex are also well documented in similar studies, although the exact computation being done by these regions during tasks can be hard to differentiate with fMRI.

      Reviewer #3 (Public review):

      The manuscript contains a carefully designed fMRI study, using MVPA pattern analysis to investigate which high-level associate cortices contain target-related information to guide visual search. A special focus is hereby on so-called 'target-associated' information, that has previously been shown to help in guiding attention during visual search. For this purpose the authors trained their participants and made them learn specific target-associations, in order to then test which brain regions may contain neural representations of those learnt associations. They found that at least some of the associations tested were encoded in prefrontal cortex during the cue and delay period.

      The manuscript is very carefully prepared. As far as I can see, the statistical analyses are all sound and the results integrate well with previous findings.

      I have no strong objections against the presented results and their interpretation.

      Reviewer #1 (Recommendations for the authors):

      One bit of trivia. In the abstract, you should define IFJ on its first appearance in the text. You get to that a bit later.

      Fixed.

      Reviewer #2 (Recommendations for the authors):

      I really don't have much to suggest, as I thought that this was a clearly written report that offered a clever paradigm and data that supported the conclusions. My only suggestion would be to split the delay period activity and test whether the strength of the template evolves over time. Even though fMRI is not the best tool for this, still you would predict stronger decoding in the second half of the delay period.

      Please see above for our response to the same comment.

      Reviewer #3 (Recommendations for the authors):

      I would just like to point out some minor aspects that might be worth improving before publishing this work.

      Abstract: While in general, the writing is clear and concise, I felt that the abstract of the manuscript was particularly hard to follow, probably because the authors at some point re-arranged individual sentences. For example, they write in line 12 about 'the preparatory period', but explain only in the following sentence that the preparatory period ensues 'before search begins'. This made it a bit hard to follow the overall logic and I think could easily be fixed. 

      We have addressed this comment and updated the abstract.

      Also in the abstract: 'The CONTENTS of the template typically CONTAIN...' sounds weird, no? Also, 'information is used to modulate sensory processing in preparation for guiding attention during search' sounds like a very over-complicated description of attentional facilitation. I'm not convinced either whether the sequence is correct here. Is the information really used to (first) modulate sensory processing (which is a sort of definition of attention in itself) to (then) prepare the guidance of attention in visual search?

      We have addressed this comment and updated the abstract.

      The sentence in line 7, 'However, many behavioral studies have shown that target-associated information is used to guide attention,...' (and the following sentence) assumes that the reader is somewhat familiar with the term 'target-associations'. I'm afraid that, for a naive reader, this term may only become fully understandable once the idea is introduced a bit later when mentioning that participants of the study were trained on face-scene pairings. I think it could help to give some very short explanation of 'target-associations' already when it is first mentioned. The term 'statistically co-occurring object pairs', for example, could be of great help here.

      Thank you for the suggestion. We have added it to the abstract.

      page 2, line 22: 'prefrotnal'

      Fixed.

      page 2, line 24/25: 'information ... can SUPPLANT (?) ... information'. (That's also a somewhat unfortunate repetition of 'information')

      Fixed.

      page 4, line 23-25: 'Working memory representations in lateral prefrontal and parietal regions are engaged in cognitive control computations that ARE (?) task non-specific but essential to their functioning'

      Fixed.

      page 7, line 1: maybe a comma before 'suggesting'?

      Fixed.

      page 7, line 14-16: Something seems wrong with this sentence: 'The distractor face was a race-gender match, which we previously FOUND MADE (?) target discrimination difficult enough to make the scene useful for guiding attention'

      We have addressed this comment and rewritten this part (now on page 7, line 18).

      Results / Discussion sections:

      In several figures, such as Fig. 3A, the three different IFJ regions are grouped separately from the other frontal areas, which makes sense given the special role IFJ plays in representing task-related templates. However, IFJ is still part of PFC. I think it would be more correct to group the other frontal areas (FEF, vLPFC, etc.) as 'Other Frontal' or even 'Other PFC'.

      We have made the changes based on the reviewer’s suggestion.

      In some of the figures, e.g. Fig. 3 and 5, I had the impression that the activation patterns of some conditions in vLPFC were rather close to the location of IFJ, which is just a bit posterior. I think I remember that functional localisers of IFJ can actually vary quite a bit in localisation (see e.g. the Baldauf/Desimone paper). Also, I think it has been shown in the context of other regions, like the human FEF, that its position when defined by localisation tasks is not always fully congruent with the respective labels in an atlas like the Glasser atlas. It might help to take this into consideration when discussing the results, particularly since the term vLPFC is a rather vague collection of several brain parcels and not a parcel name in the Glasser atlas. Some people might even argue that vLPFC in the broad sense contains IFJ, similar to how 'Frontal' contains IFJ (see above). How strong of a point do the authors want to make about activation in IFJ versus in vLPFC?

      We have now added text discussing the inability to truly differentiate between subregions of IFJ and other parts of vLPFC in the methods section on ROIs (page 25, line 13) and in the discussion (page 18, line 25). However, given the likely imprecision of ROI boundaries, it is arguably even more surprising that we see distinct patterns between the subregions of IFJ defined by the Glasser HCP-MMP1 atlas and the other vLPFC regions defined by the 17-network atlases. We do not wish to overstate the precision of the IFJ regions, but we note the ROI results within the context of the larger literature. We are sure that our findings will have to be reinterpreted when newer methods allow for better localization of functional subregions of the vLPFC in individuals.

      Given that the authors nicely explain in the introduction how important templates are in visual search, and given that FEF has such an important role in serially guiding saccades through visual search templates, I think it would be worth discussing the finding that FEF did not hold representation of these targets. Of course, this could be in part due to the specific task at hand, but it may still be interesting to note in the Discussion section that here FEF, although important for some top-down attention signals, did not keep representations of the 'search' templates. Is it because there is no spatial component to the task at hand (like proposed in Bedini 2021)?

      We have now added text directly addressing this point and citing the Bedini et al. paper in the discussion (page 18, line 18). Besides our current findings, the relationship between IFJ and FEF is really interesting and will hopefully be investigated more in the future.

      Page 18, line 5: 'we the(N) associated...'

      Fixed.

    1. eLife Assessment

      This manuscript by Li, Lu et al., presents important findings on the role of cDC1 in atherosclerosis and their influence on the adaptive immune system. Using Xcr1Cre-Gfp Rosa26LSL-DTA ApoE-/- mouse models, these data convincingly reveal an unexpected, non-redundant role of the XCL1-XCR1 axis in mediating cDC1 contributions to atherosclerosis.

    2. Reviewer #1 (Public review):

      Summary:

      In this study by Li et al., the authors re-investigated the role of cDC1 for atherosclerosis progression using the ApoE model. First, the authors confirmed the accumulation of cDC1 in atherosclerotic lesions in mice and humans. Then in order to examine the functional relevance of this cell type, the authors developed a new mouse model to selectively target cDC1. Specifically, they inserted the Cre recombinase directly after the start codon of endogenous XCR1 gene, thereby avoiding off-target activity. Following validation of this model, the authors crossed it with ApoE-deficient mice and found a striking reduction of aortic lesions (numbers and size) following high fat diet. The authors further characterized the impact of cDC1 depletion on lesional T cells and their activation state. Also, they provide in-depth transcriptomic analyses of lesional in comparison to splenic and nodal cDC1. These results imply cellular interactions between lesion T cells and cDC1. Finally, the authors show that the chemokine XCL1, which is produced by activated CD8 T cells (and NK cells) plays a key role for the interaction with XCR1-expressing cDC1 and particularly for the atherosclerotic disease progression.

      Strengths:

      The surprising results on XCL1 represent a very important gain in knowledge. The role of cDC1 is clarified with a new genetic mouse model.

      Comments on revised version:

      The authors have addressed my concerns in the revised version of this manuscript.

    3. Reviewer #2 (Public review):

      This study investigates the role of cDC1 in atherosclerosis progression using Xcr1Cre-Gfp Rosa26LSL-DTA ApoE-/- mice. The authors demonstrate that selective depletion of cDC1 reduces atherosclerotic lesions in hyperlipidemic mice. While cDC1 depletion did not alter macrophage populations, it suppressed T cell activation (both CD4+ and CD8+ subsets) within aortic plaques. Further, targeting the chemokine Xcl1 (ligand of Xcr1) effectively inhibits atherosclerosis. The manuscript is well-written, and data are clearly presented. The data provided in the article support the authors' conclusions well.

      Comments on revised version:

      The authors have addressed all previous concerns and made appropriate revisions to the data. I have no further questions.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      In this study by Li et al., the authors re-investigated the role of cDC1 for atherosclerosis progression using the ApoE model. First, the authors confirmed the accumulation of cDC1 in atherosclerotic lesions in mice and humans. Then, in order to examine the functional relevance of this cell type, the authors developed a new mouse model to selectively target cDC1. Specifically, they inserted the Cre recombinase directly after the start codon of the endogenous XCR1 gene, thereby avoiding off-target activity. Following validation of this model, the authors crossed it with ApoE-deficient mice and found a striking reduction of aortic lesions (numbers and size) following a high-fat diet. The authors further characterized the impact of cDC1 depletion on lesional T cells and their activation state. Also, they provide in-depth transcriptomic analyses of lesional in comparison to splenic and nodal cDC1. These results imply cellular interactions between lesion T cells and cDC1. Finally, the authors show that the chemokine XCL1, which is produced by activated CD8 T cells (and NK cells), plays a key role in the interaction with XCR1-expressing cDC1 and particularly in the atherosclerotic disease progression.

      Strengths:

      The surprising results on XCL1 represent a very important gain in knowledge. The role of cDC1 is clarified with a new genetic mouse model.

      Thank you.

      Weaknesses:

      My criticism is limited to the analysis of the scRNAseq data of the cDC1. I think it would be important to match these data with published data sets on cDC1. In particular, the data set by Sophie Janssen's group on splenic cDC1 might be helpful here (PMID: 37172103; https://www.single-cell.be/spleen_cDC_homeostatic_maturation/datasets/cdc1). It would be good to assign a cluster based on the categories used there (early/late, immature/mature, at least for splenic DC).

      Thank you very much for your help. Using the scRNA-seq data of Xcr1<sup>+</sup> cDC1 sorted from ApoE<sup>–/–</sup> mice, we re-annotated the populations, following the methodology proposed by Sophie Janssen's group. These results are presented in Figure S9 and Figure S10 and described in detail in the Results and Discussion sections.

      Please refer to the Results section from line 264 to 284: “Using the scRNA seq data of Xcr1<sup>+</sup> cDC1 sorted from hyperlipidemic mice, we annotated the 10 populations as shown in Figure S9A, following the methodology from a previous study [41]. Ccr7<sup>+</sup> mature cDC1s (Cluster 3, 7 and 9) and Ccr7- immature cDC1s (remaining clusters) were identified across cDC1 cells sorted from aorta, spleen and lymph nodes (Figure S9B). Further stratification based on marker genes reveals that Cluster 10 is the pre-cDC1, with high expression level of CD62L (Sell) and low expression level of CD8a (Figure S9C). Cluster 6 and 8 are the proliferating cDC1s, which express high level of cell cycling genes Stmn1 and Top2a (Figure S9D). Cluster 1 and 4 are early immature cDC1s, and cluster 2 and 5 are late immature cDC1s, according to the expression pattern of Itgae, Nr4a2 (Figure S9E). Cluster 9 cells are early mature cDC1s, with elevated expression of Cxcl9 and Cxcl10 (Figure S9F). Cluster 3 and 7 as late mature cDC1s, characterized by the expression of Cd63 and Fscn1 (Figure S9G). As shown in Figure 5C and Figure S9, the 10 populations displayed a major difference of aortic cDC1 cells that lack in pre-cDC1s (cluster 10) and mature cells (cluster 3, 7 and 9). Interestingly, in hyperlipidemic mice splenic cDC1 possess only Cluster 3 as the late mature cells while the lymph node cDC1 cells have two late mature populations namely Cluster 3 and Cluster 7. In further analysis, we also compared splenic cDC1 cells from HFD mice to those from ND mice. As shown in Figure S10, HFD appears to impact early immature cDC1-1 cells (Cluster 1) and increases the abundance of late immature cDC1 cells (Cluster 2 and 5), regardless of the fact that all 10 populations are present in two origins of samples. We also found that Tnfaip3 and Serinc3 are among the most upregulated genes, while Apol7c and Tifab are downregulated in splenic cDC1 cells sorted from HFD mice”.  
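
      The cluster annotation quoted above can be summarized as a simple lookup table. The following is a minimal sketch in Python that encodes only the cluster-to-stage assignments stated in the quoted Results text; the names `CLUSTER_STAGE`, `MATURE_CLUSTERS`, `annotate` and `AORTA_LACKING` are hypothetical and are not part of the authors' analysis pipeline.

```python
# Cluster-to-stage assignments as stated in the quoted Results text (Figure S9).
# All names here are illustrative; this is not the authors' pipeline.
CLUSTER_STAGE = {
    10: "pre-cDC1",
    6: "proliferating cDC1",
    8: "proliferating cDC1",
    1: "early immature cDC1",
    4: "early immature cDC1",
    2: "late immature cDC1",
    5: "late immature cDC1",
    9: "early mature cDC1",
    3: "late mature cDC1",
    7: "late mature cDC1",
}

# Ccr7+ mature clusters per the quoted text; the remaining clusters are Ccr7-.
MATURE_CLUSTERS = {3, 7, 9}


def annotate(cluster: int) -> tuple[str, str]:
    """Return (maturation stage, Ccr7 status) for a cluster number."""
    stage = CLUSTER_STAGE[cluster]
    ccr7 = "Ccr7+ mature" if cluster in MATURE_CLUSTERS else "Ccr7- immature"
    return stage, ccr7


# Clusters the quoted text reports as lacking among aortic cDC1:
# pre-cDC1 (cluster 10) and the mature clusters (3, 7 and 9).
AORTA_LACKING = {10} | MATURE_CLUSTERS
aortic_clusters = sorted(set(CLUSTER_STAGE) - AORTA_LACKING)

if __name__ == "__main__":
    for cluster in sorted(CLUSTER_STAGE):
        print(cluster, *annotate(cluster))
    print("Clusters remaining in aorta:", aortic_clusters)
```

      Keeping the assignments in one table makes it easy to check that every one of the 10 clusters receives exactly one stage label and that the mature set matches the Ccr7<sup>+</sup> clusters named in the text.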

      Please refer to the Discussion section from line 380 to 385: “Based on the maturation analysis of the cDC1 scRNA seq data [41], our findings suggest that the aortic cDC1 cells display a major difference from those of spleen and lymph nodes by lacking the mature clusters, whereas lymph node cDC1 cells contain an additional Fabp5<sup>+</sup> S100a4<sup>+</sup> late mature Cluster. Our results also suggest that hyperlipidemia contributes to alteration in early immature cDC1 and in the abundance of late immature cDC1 cells, which was associated with dramatic change in gene expression of Tnfaip3, Serinc3, Apol7c and Tifab”.

      Reviewer #2 (Public review):

      This study investigates the role of cDC1 in atherosclerosis progression using Xcr1Cre-Gfp Rosa26LSL-DTA ApoE-/- mice. The authors demonstrate that selective depletion of cDC1 reduces atherosclerotic lesions in hyperlipidemic mice. While cDC1 depletion did not alter macrophage populations, it suppressed T cell activation (both CD4+ and CD8+ subsets) within aortic plaques. Further, targeting the chemokine Xcl1 (ligand of Xcr1) effectively inhibits atherosclerosis. The manuscript is well-written, and the data are clearly presented. However, several points require clarification:

      (1) In Figure 1C (upper plot), it is not clear what the Xcr1 single-positive region in the aortic root represents, or whether this is caused by unspecific staining. So I wonder whether Xcr1 single-positive staining can reliably represent cDC1. For accurate cDC1 gating in Figure 1E, Xcr1+CD11c+ co-staining should be used instead.

      The observed false-positive signal in the wavy structures within immunofluorescence Figure 1C (upper panel) results from the strong autofluorescence of elastic fibers, a major vascular wall component (alongside collagen). This intrinsic property of elastic fibers is a well-documented confounder in immunofluorescence studies [A, B].

      In contrast, immunohistochemistry (IHC) employs an enzymatic chromogenic reaction (HRP with DAB substrate) that generates a brown precipitate exclusively at antigen-antibody binding sites. Importantly, vascular elastic fibers lack endogenous enzymatic activity capable of catalyzing the DAB reaction, thereby preventing this source of false positivity in IHC.

      Given that Xcr1 is exclusively expressed on conventional type 1 dendritic cells [C], and considering that IHC lacks the multiplexing capability inherent to immunofluorescence for antigen co-localization, single-positive Xcr1 staining reliably identifies cDC1s in IHC results.

      [A] König, K et al. “Multiphoton autofluorescence imaging of intratissue elastic fibers.” Biomaterials vol. 26,5 (2005): 495-500. doi:10.1016/j.biomaterials.2004.02.059

      [B] Andreasson, Anne-Christine et al. “Confocal scanning laser microscopy measurements of atherosclerotic lesions in mice aorta. A fast evaluation method for volume determinations.” Atherosclerosis vol. 179,1 (2005): 35-42. doi:10.1016/j.atherosclerosis.2004.10.040

      [C] Dorner, Brigitte G et al. “Selective expression of the chemokine receptor XCR1 on cross-presenting dendritic cells determines cooperation with CD8+ T cells.” Immunity vol. 31,5 (2009): 823-33. doi:10.1016/j.immuni.2009.08.027

      (2) Figure 4D suggests that cDC1 depletion does not affect CD4+/CD8+ T cells. However, only the proportion of these subsets within total T cells is shown. To fully interpret effects, the authors should provide:

      (a) Absolute numbers of total T cells in aortas.

      (b) Absolute counts of CD4+ and CD8+ T cells.

      Thanks for your suggestions. We agree that assessing both proportions and absolute numbers in Figure 4 provides a more complete picture of the effects of cDC1 depletion on T cell populations. We have also added the absolute counts of cDC1 cells and total T cells, and the CD44 MFI (mean fluorescence intensity) in CD4<sup>+</sup> and CD8<sup>+</sup> T cells, to Figure 4, and have added the corresponding text to the revised manuscript.

      Please refer to the Results section from line 183 to 187: “Subsequently, we assessed T cell phenotype in the two groups of mice. While neither the frequencies nor absolute counts of aortic CD4<sup>+</sup> and CD8<sup>+</sup> T cells differed significantly between two groups of mice (Figure 4D-F), CD69 frequency and CD44 MFI (Mean Fluorescence Intensity), the T cell activation markers, were significantly reduced in both CD4<sup>+</sup> and CD8<sup>+</sup> T cells from Xcr1<sup>+</sup> cDC1 depleted mice compared to controls (Figure 4G and H)”.
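
      The reviewer's distinction between subset proportions and absolute counts can be illustrated with a small worked example. The sketch below uses invented numbers chosen purely for illustration; they are not data from the study, and the function name `subset_frequency` is hypothetical.

```python
def subset_frequency(subset_count: int, total_t_cells: int) -> float:
    """Percentage of a T cell subset within total T cells."""
    return 100.0 * subset_count / total_t_cells


# Hypothetical aortas (invented numbers): the second sample has half as many
# T cells overall, but the CD4/CD8 split is identical.
sample_a = {"total": 2000, "cd4": 1200, "cd8": 800}
sample_b = {"total": 1000, "cd4": 600, "cd8": 400}

if __name__ == "__main__":
    for name, s in (("sample_a", sample_a), ("sample_b", sample_b)):
        cd4_pct = subset_frequency(s["cd4"], s["total"])
        cd8_pct = subset_frequency(s["cd8"], s["total"])
        print(f"{name}: CD4 {cd4_pct:.1f}% (n={s['cd4']}), "
              f"CD8 {cd8_pct:.1f}% (n={s['cd8']})")
```

      In this toy case the frequencies are identical in both samples even though every absolute count is halved, which is why reporting proportions alone cannot rule out a change in T cell numbers.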

      (3) How does T cell activation mechanistically influence atherosclerosis progression? Why was CD69 selected as the sole activation marker? Were other markers (e.g., KLRG1, ICOS, CD44) examined to confirm activation status?

      We sincerely appreciate these insightful comments. As extensively documented in the literature, activated effector T cells (both CD4+ and CD8+) critically promote plaque inflammation and instability through their production of pro-inflammatory cytokines (particularly IFN-γ and TNF-α), which drive endothelial activation, exacerbate macrophage inflammatory responses, and impair smooth muscle cell function [A].

      In our study, we specifically investigated the role of cDC1 cells in atherosclerosis progression. Our key findings demonstrate that cDC1 depletion attenuates T cell activation (as shown by reduced CD69/CD44 expression) and that this reduction in activation is functionally linked to the observed decrease in atherosclerosis burden in our model. 

      Regarding CD44 as an activation marker, we performed quantitative analyses of CD44 mean fluorescence intensity (MFI) in aortic T cells (Figure 4). Importantly, the MFI of CD44 was significantly lower on both CD4+ and CD8+ T cells from Xcr1<sup>Cre-Gfp</sup> Rosa26<sup>LSL-DTA</sup> ApoE<sup>–/–</sup> mice compared to the control ApoE<sup>–/–</sup> mice (data shown below), which is consistent with the CD69 result in Figure 4. We added the related description to the Results section.

      Please refer to the Results section from line 185 to 187 “CD69 frequency and CD44 MFI (Mean Fluorescence Intensity), the T cell activation markers, were significantly reduced in both CD4+ and CD8+ T cells from Xcr1+ cDC1 depleted mice compared to controls (Figure 4G and H)”.

      Similarly, MFI of CD44 was significantly lower on both CD4<sup>+</sup> and CD8<sup>+</sup> T cells from Xcl1<sup>–/–</sup> ApoE<sup>–/–</sup> mice compared to the control ApoE<sup>–/–</sup> mice (data shown below), which is consistent with the result of CD69 in Figure 7. We also added the related description in the Result section.

      Please refer to the Results section from line 308 to 309 “Crucially, CD69<sup>+</sup> frequency and CD44 MFI remained comparable in both aortic CD4<sup>+</sup> and CD8<sup>+</sup> T cells between two groups (Figure 7D-F).”

      [A] Hansson, Göran K, and Andreas Hermansson. “The immune system in atherosclerosis.” Nature immunology vol. 12,3 (2011): 204-12. doi:10.1038/ni.2001

      (4) Figure 7B: Beyond cDC1/2 proportions within cDCs, please report absolute counts of: Total cDCs, cDC1, and cDC2 subsets. Figure 7D: In addition to CD4+/CD8+ T cell proportions, the following should be included:

      (a) Total T cell numbers in aortas

      (b) Absolute counts of CD4+ and CD8+ T cells.

      Thanks for your suggestions. We have now included in Figure 7 the absolute counts of cDC, cDC1, and cDC2 cells, along with CD4<sup>+</sup> and CD8<sup>+</sup> T cells in aortic tissues. Additionally, we provide the corresponding CD44 mean fluorescence intensity (MFI) measurements for both CD4<sup>+</sup> and CD8<sup>+</sup> T cell populations. We added the related description to the Results section.

      Please refer to the Results section from line 303 to 311: “The flow cytometric results illustrated that both frequencies and absolute counts of Xcr1<sup>+</sup> cDC1 cells in the aorta were significantly reduced, but cDCs and cDC2 cells from Xcl1<sup>–/–</sup> ApoE<sup>–/–</sup> were comparable with that from ApoE<sup>–/–</sup> (Figure 7A-C). Moreover, in both lymph node and spleen, the absolute numbers of pDC, cDC1 and cDC2 from Xcl1<sup>–/–</sup> ApoE<sup>–/–</sup> were comparable with that from ApoE<sup>–/–</sup> (Figure S11). Crucially, CD69<sup>+</sup> frequency and CD44 MFI remained comparable in both aortic CD4<sup>+</sup> and CD8<sup>+</sup> T cells between two groups (Figure 7D-F). However, aortic CD8<sup>+</sup> T cells exhibited reduced frequency and absolute count, while CD4<sup>+</sup> T cells showed increased frequency but unchanged counts in Xcl1<sup>–/–</sup> ApoE<sup>–/–</sup> mouse versus controls (Figure 7G and H).”

      (5) cDC1 depletion reduced CD69+CD4+ and CD69+CD8+ T cells, whereas Xcl1 depletion decreased Xcr1+ cDC1 cells without altering activated T cells. How do the authors explain these different results? This discrepancy needs explanation.

      We sincerely appreciate your professional and insightful comments regarding the mechanistic relationship between cDC1 depletion and T cell activation. Direct cDC1 depletion in the Xcr1<sup>Cre-Gfp</sup> Rosa26<sup>LSL-DTA</sup> ApoE<sup>–/–</sup> mouse model removes both recruited and tissue-resident cDC1s, eliminating their multifunctional roles in antigen presentation, co-stimulation and cytokine secretion essential for T cell activation. In contrast, Xcl1 depletion reduces, but does not eliminate, cDC1 migration into plaques. Furthermore, alternative chemokine axes (e.g., CCL5/CCR5, CXCL9/CXCR3, BCL9/BCL9L) may partially rescue cDC1 recruitment [13, 68, 69], and non-cDC1 APCs (e.g., monocytes, cDC2s) may compensate for T cell activation [55, 70]. We emphasize that Xcl1 depletion specifically failed to alter T cell activation in hyperlipidemic ApoE<sup>–/–</sup> mice. However, its impact may differ in other pathophysiological contexts due to compensatory mechanisms. We thank you again for highlighting this nuance, which strengthens our mechanistic interpretation. We have added these points to the Discussion section and included new references.

      Please refer to the Discussion section from line 407 to 413: “Notably, while complete ablation of Xcr1<sup>+</sup> cDC1s impaired T cell activation, reduction of Xcr1<sup>+</sup> cDC1 recruitment via Xcl1 deletion did not significantly compromise this process. This discrepancy may arise through compensatory mechanisms: alternative chemokine axes (e.g., CCL5/CCR5, CXCL9/CXCR3, BCL9/BCL9L) may partially rescue Xcr1<sup>+</sup> cDC1 homing [13, 68, 69], while non-cDC1 antigen-presenting cells (e.g., monocytes, cDC2s) may sustain T cell activation [55, 70]. Furthermore, tissue-specific microenvironment factors could potentially modulate its role in other diseases.”

      [13] Eisenbarth, S C. “Dendritic cell subsets in T cell programming: location dictates function.” Nature reviews. Immunology vol. 19,2 (2019): 89-103. doi:10.1038/s41577-018-0088-1

      [55] Brewitz, Anna et al. “CD8+ T Cells Orchestrate pDC-XCR1+ Dendritic Cell Spatial and Functional Cooperativity to Optimize Priming.” Immunity vol. 46,2 (2017): 205-219. doi:10.1016/j.immuni.2017.01.003

      [68] de Oliveira, Carine Ervolino et al. “CCR5-Dependent Homing of T Regulatory Cells to the Tumor Microenvironment Contributes to Skin Squamous Cell Carcinoma Development.” Molecular cancer therapeutics vol. 16,12 (2017): 2871-2880. doi:10.1158/1535-7163.MCT-17-0341

      [69] He F, Wu Z, Liu C, Zhu Y, Zhou Y, Tian E, et al. Targeting BCL9/BCL9L enhances antigen presentation by promoting conventional type 1 dendritic cell (cDC1) activation and tumor infiltration. Signal Transduct Target Ther. 2024;9(1):139. Epub 2024/05/30. doi: 10.1038/s41392-024-01838-9. PubMed PMID: 38811552; PubMed Central PMCID: PMCPMC11137111.

      [70] Böttcher, Jan P et al. “Functional classification of memory CD8(+) T cells by CX3CR1 expression.” Nature communications vol. 6 8306. 25 Sep. 2015, doi:10.1038/ncomms9306.

      Reviewer #1 (Recommendations for the authors):

      (1) Line 32 - The authors might want to add that the mouse model leads to a "constitutive" depletion of cDC1.

      Thanks for your advice. We have revised the sentence as follows.

      Please refer to the Results section from line 31 to 33: “we established Xcr1<sup>Cre-Gfp</sup> Rosa26<sup>LSL-DTA</sup> ApoE<sup>–/–</sup> mice, a novel and complex genetic model, in which cDC1 was constitutively depleted in vivo during atherosclerosis development”.

      (2) Line 187-188: The authors claim that T cell activation was "inhibited" if cDC1 was depleted. The data shows that the T cells were less activated, but there is no indication of any kind of inhibition; this should be corrected.

      Thanks for your advice. We have revised the sentence as follows.

      Please refer to the Results section from line 183 to 187: “Subsequently, we assessed T cell phenotype in the two groups of mice. While neither the frequencies nor absolute counts of aortic CD4<sup>+</sup> and CD8<sup>+</sup> T cells differed significantly between two groups of mice (Figure 4D-F), CD69 frequency and CD44 MFI (Mean Fluorescence Intensity), the T cell activation markers, were significantly reduced in both CD4<sup>+</sup> and CD8<sup>+</sup> T cells from Xcr1<sup>+</sup> cDC1 depleted mice compared to controls (Figure 4G and H)”.

      (3) Why are some splenic DC clusters absent in LNs and vice versa? This is not obvious to this reviewer and should at least be discussed.

      We appreciate the insightful question regarding the absence of certain splenic DC clusters in LNs. This phenomenon in Figure 5 aligns with the 'division of labor' paradigm in dendritic cell biology: tissue microenvironments evolve specialized DC subsets to address local immunological challenges. The absence of universal clusters reflects functional adaptation, not technical artifacts. We acknowledge that this tissue-specific heterogeneity warrants further discussion and have expanded our analysis to address this point in the discussion part of our manuscript.

      Please refer to the Discussion section from line 375 to 385: “This pronounced tissue-specific compartmentalization of Xcr1<sup>+</sup> cDC1 subsets may related to multiple mechanisms including developmental imprinting that instructs precursor differentiation into transcriptionally distinct subpopulations [62], and microenvironmental filtering through organ-specific chemokine axes (e.g., CCL2/CCR2 in spleen) selectively recruits receptor-matched subsets [63, 64]. This spatial specialization optimizes pathogen surveillance for local immunological challenges. Based on the maturation analysis of the cDC1 scRNA seq data [41], our findings suggest that the aortic cDC1 cells display a major difference from those of spleen and lymph nodes by lacking the mature clusters, whereas lymph node cDC1 cells contain an additional Fabp5<sup>+</sup> S100a4<sup>+</sup> late mature Cluster. Our results also suggest that hyperlipidemia contributes to alteration in early immature cDC1 and in the abundance of late immature cDC1 cells, which was associated with dramatic change in gene expression of Tnfaip3, Serinc3, Apol7c and Tifab”.

      [62]. Liu Z, Gu Y, Chakarov S, Bleriot C, Kwok I, Chen X, et al. Fate Mapping via Ms4a3-Expression History Traces Monocyte-Derived Cells. Cell. 2019;178(6):1509-25 e19. Epub 2019/09/07. doi: 10.1016/j.cell.2019.08.009. PubMed PMID: 31491389.

      [63]. Bosmans LA, van Tiel CM, Aarts S, Willemsen L, Baardman J, van Os BW, et al. Myeloid CD40 deficiency reduces atherosclerosis by impairing macrophages' transition into a pro-inflammatory state. Cardiovasc Res. 2023;119(5):1146-60. Epub 2022/05/20. doi: 10.1093/cvr/cvac084. PubMed PMID: 35587037; PubMed Central PMCID: PMCPMC10202633.

      [64]. Mildner A, Schonheit J, Giladi A, David E, Lara-Astiaso D, Lorenzo-Vivas E, et al. Genomic Characterization of Murine Monocytes Reveals C/EBPbeta Transcription Factor Dependence of Ly6C(-) Cells. Immunity. 2017;46(5):849-62 e7. Epub 2017/05/18. doi: 10.1016/j.immuni.2017.04.018. PubMed PMID: 28514690.

      [41]. Bosteels V, Marechal S, De Nolf C, Rennen S, Maelfait J, Tavernier SJ, et al. LXR signaling controls homeostatic dendritic cell maturation. Sci Immunol. 2023;8(83):eadd3955. Epub 2023/05/12. doi: 10.1126/sciimmunol.add3955. PubMed PMID: 37172103.

      (4) The authors should discuss how XCL1 could impact lesional cDC1 and T cell abundance. Notably, preDCs do not express XCR1, and T cells express XCL1 following TCR activation. Is there a recruitment or local proliferation defect of cDC1 in the absence of XCL1? Could there also be a role for NK cells as a potential source of XCL1?

      We appreciate your insightful questions regarding the differential effects of Xcl1 on cDC1s and T cells. Xcl1 primarily mediates the recruitment of mature cDC1s. Our data demonstrate that Xcl1 deletion significantly reduces aortic cDC1 abundance, which correlates with a concomitant decrease in CD8<sup>+</sup> T cell numbers within the aorta. These findings strongly suggest that the Xcl1-Xcr1 axis plays a regulatory role in T cell accumulation in aortic plaques.

      Consistent with prior studies [A, B], cDC1 recruitment can occur in the absence of Xcl1, which echoes our finding that cDC1 cells were still present in Xcl1-knockout aortic plaques, albeit at lower abundance. Further studies are required to address how Xcl1-dependent and Xcl1-independent cDC1 cells activate T cells and whether they differ in their capacity to proliferate in tissue. We have added these points to the Discussion section.

      Please refer to the Discussion section from line 407 to 415: “Notably, while complete ablation of Xcr1<sup>+</sup> cDC1s impaired T cell activation, reduction of Xcr1<sup>+</sup> cDC1 recruitment via Xcl1 deletion did not significantly compromise this process. This discrepancy may arise through compensatory mechanisms: alternative chemokine axes (e.g., CCL5/CCR5, CXCL9/CXCR3, BCL9/BCL9L) may partially rescue Xcr1<sup>+</sup> cDC1 homing [13, 68, 69], while non-cDC1 antigen-presenting cells (e.g., monocytes, cDC2s) may sustain T cell activation [55, 70]. Furthermore, tissue-specific microenvironment factors could potentially modulate its role in other diseases. In summary, our findings identify Xcl1 as a potential therapeutic target for atherosclerosis therapy, though its cellular origins and regulation of lesional Xcr1<sup>+</sup> cDC1 and T cells dynamics require further studies”.

      In the literature, Xcl1 is expressed in NK cells and subsets of T cells, and NK cells could be a source of Xcl1 during atherosclerosis, which deserves further investigation [A, C, D].

      [A] Böttcher, Jan P et al. “NK Cells Stimulate Recruitment of cDC1 into the Tumor Microenvironment Promoting Cancer Immune Control.” Cell vol. 172,5 (2018): 1022-1037.e14. doi:10.1016/j.cell.2018.01.004

      [B] He, Fenglian et al. “Targeting BCL9/BCL9L enhances antigen presentation by promoting conventional type 1 dendritic cell (cDC1) activation and tumor infiltration.” Signal transduction and targeted therapy vol. 9,1 139. 29 May. 2024, doi:10.1038/s41392-024-01838-9

      [C] Woo, Yeon Duk et al. “The invariant natural killer T cell-mediated chemokine X-C motif chemokine ligand 1-X-C motif chemokine receptor 1 axis promotes allergic airway hyperresponsiveness by recruiting CD103+ dendritic cells.” The Journal of allergy and clinical immunology vol. 142,6 (2018): 1781-1792.e12. doi:10.1016/j.jaci.2017.12.1005

      [D] Winkels, Holger et al. “Atlas of the Immune Cell Repertoire in Mouse Atherosclerosis Defined by Single-Cell RNA-Sequencing and Mass Cytometry.” Circulation research vol. 122,12 (2018): 1675-1688. doi:10.1161/CIRCRESAHA.117.312513

      Reviewer #2 (Recommendations for the authors):

      There is a logical error in line 298. I suggest revising to: "Collectively, these data suggest that Xcl1 promotes atherosclerosis by recruiting Xcr1+ cDC1 cells, which subsequently drive T cell activation in lesions."

      Thanks for your advice. Since Xcl1 deficiency reduced both the frequencies and absolute counts of Xcr1+ cDC1 and CD8+ T cells in lesions without affecting T cell activation, we revised the sentence as you suggested.

      Please refer to the Results section from line 314 to 315: “Collectively, these data suggest that Xcl1 promotes atherosclerosis by recruiting Xcr1<sup>+</sup> cDC1 cells, and facilitating CD8<sup>+</sup> T cell accumulation in lesions”.

    1. eLife Assessment

      This important study elucidates the molecular function of the SARS-CoV-2 helicase NSP13, which inhibits the transcriptional activity of the YAP/TEAD complex in vitro and in vivo. The evidence supporting the authors' claims is compelling, based on cell biological assays and multi-omic studies. This work contributes to the understanding of the new regulatory mechanism of YAP/TEAD after SARS-CoV-2 infection and will be of interest to researchers investigating COVID-19 infection and the Hippo-YAP signaling pathway.

    2. Reviewer #1 (Public review):

      In the revised manuscript, Meng et al. report that SARS-CoV-2 infection suppresses YAP target gene transcription in both patient lung samples and iPSC-derived cardiomyocytes. Among the tested viral proteins, the helicase nonstructural protein 13 (NSP13) was identified as a key factor that impairs YAP/TEAD transcriptional activity. Through mutagenesis and protein-protein interaction studies, the authors propose a mechanism where NSP13 binds YAP/TEAD complex, remodels chromatin structure, and recruits transcriptional repressors to inhibit YAP/TEAD's transcriptional activity.

      Overall, this study uncovers a novel regulation of Hippo signaling by SARS-CoV-2 through NSP13, suggesting a potential role of this growth-related pathway in host innate immune response to viral infection. While these findings are intriguing, future studies are needed to validate the involvement of YAP/TEAD in patient tissues and to assess their potential as therapeutic targets against SARS-CoV-2.

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript by Meng et al. describes a role for the coronavirus helicase NSP13 in the regulation of YAP-TEAD-mediated transcription. The authors present data that NSP13 expression in cells reduces YAP-induced TEAD luciferase reporter activity and that NSP13 transduction in cardiomyocytes blocks hyperactive YAP-mutant phenotypes in vivo. The mechanisms by which viral proteins (particularly those from coronaviruses) intersect with cellular signaling events are an important research topic, and the intersection of NSP13 with YAP-TEAD transcriptional activity (independent of upstream Hippo pathway-mediated signals) offers new knowledge that is of interest to a broad range of researchers.

      Strengths:

      The manuscript presents convincing data mapping the effects of NSP13 on YAP-TEAD reporter activity to the helicase domain. Moreover, the in vivo data demonstrating that NSP13 expression in YAP5SA mouse cardiomyocytes increased animal survival rates and restored cardiac function are striking and supportive of the model presented.

      Weaknesses:

      While there are some hints at the mechanisms by which NSP13 regulates YAP-TEAD activity through the identification of NSP13-associated proteins by mass spec, the relationships and functions of these factors in the context of YAP-TEAD regulation requires further study in the future.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Major points

      (1) The authors discovered a novel regulation of the Hippo-YAP pathway by SARS-CoV-2 infection but did not address the pathological significance of this finding. It remains unclear why YAP downstream gene transcription needs to be inhibited in response to SARS-CoV-2 infection. Is this inhibition crucial for the innate immune response to SARS-CoV-2? The authors should re-analyze their snRNA-seq and bulk RNA-seq data described in Figure 1 to determine whether any of the affected YAP downstream genes are involved in this process.

      We appreciate the reviewer’s suggestion to clarify the pathological significance of YAP pathway inhibition in SARS-CoV-2 infection. To address this, we re-analyzed our snRNA-seq and bulk RNA-seq datasets to determine whether YAP target genes overlap with known mediators of the innate immune response. As described in Fig. 1C, bulk RNA-seq revealed decreased expression of multiple YAP downstream targets linked to innate immune regulation (e.g., Thbs1, Ccl2, Axl, and Csf1) in SARS-CoV-2–infected cells in vitro.

      snRNA-seq of alveolar type I (AT1) cells from COVID-19 patients revealed a more complex landscape: While we observed reduced YAP activity overall (Fig. 1G), multiple YAP target genes involved in innate immunity and cytokine signaling were paradoxically elevated (Supplemental Fig. 1E). Several factors likely explain these conflicting observations: 1. In the lung, AT1 cells (which are critical for gas exchange) may respond to viral infection in a cell-type-specific manner by upregulating immune-response genes through other signaling pathway(s); 2. In vivo, SARS-CoV-2 infection triggers a surge in cytokines, chemokines, and other local factors that can differentially modulate YAP binding sites and thus affect its downstream targets, a complexity not fully captured in vitro; 3. YAP is highly sensitive to mechanical signals and tissue architecture. The 3D structure of altered cell–cell junctions in infected lung tissue and fluid shear stress in the alveolar space could shape YAP target gene transcription differently from simplified monolayer cell cultures.

      We have expanded the results section of the new version to include the above points. We also acknowledge that ongoing and future work is needed to delineate the exact molecular and tissue-specific pathways through which YAP inhibition confers a potential advantage in combating SARS-CoV-2.

      (2) The authors concluded that helicase activity is required for NSP13-induced inhibition of YAP transcriptional activity based on mutation studies (Figure 3B). This finding is somewhat confusing, as K131, K345/K347, and R567 are all essential residues for NSP13 helicase activity while mutating K131 did not affect NSP13's ability to inhibit YAP (Figure 3B). Additionally, there are no data showing exactly how NSP13 inhibits the YAP/TEAD complex through its helicase function. This point was also not reflected in their proposed working model (Figure 4H).

      We appreciate the reviewer’s concerns regarding the helicase‐dependent inhibition of YAP by NSP13, particularly the roles of K131, K345/K347, and R567. Based on published structural and biochemical studies, each of these residues uniquely supports helicase function (1): K131 is crucial for stabilizing the NSP13 stalk region by interacting with S424. Substituting K131 with alanine (K131A) reduces helicase efficiency but does not completely abolish it; K345/K347 are key DNA‐binding residues, and mutating both (K345A/K347A) largely prevents NSP13 from binding DNA, thus eliminating unwinding. R567 is critical for ATP hydrolysis, and the R567A mutant retains DNA binding capacity but fails to unwind it. In Fig. 3B, K131A suppresses YAP transactivation to nearly the same extent as wild‐type NSP13, suggesting that partial helicase activity is sufficient for complete YAP/TEAD inhibition. Conversely, the K345A/K347A and R567A mutants show markedly diminished repression, underscoring the importance of DNA binding and ATP hydrolysis.

      As the new Fig. 4J illustrates, NSP13 must bind DNA and hydrolyze ATP to unwind nucleic acids. This helicase‐dependent process likely enables NSP13 to remodel chromatin structure by binding TEAD and to properly organize YAP repressors at the YAP/TEAD complex, thereby preventing YAP/TEAD transactivation. In support of this mechanism, the K345A/K347A mutant, unable to anchor to DNA, fails to repress YAP and slightly increases YAP‐driven transcription (Fig. 3B), presumably by mislocalizing YAP repressors. Likewise, the ATPase‐dead R567A can bind DNA but does not unwind and remodel chromatin to recruit YAP repressors, resulting in a loss of YAP suppression (Fig. 3B and 3F). Our revised model demonstrates that both DNA binding and ATP‐dependent unwinding are essential for NSP13 to suppress YAP transcriptional activity. We have updated the results, discussion, and model accordingly.

      (3) The proposed model that NSP13 binds TEAD4 to recruit repressor proteins and inhibits YAP/TEAD downstream gene transcription (Figure 4H) needs further characterization. Second, NSP13 is a DNA-binding protein, and its nucleic acid-binding mutant K345A/K347A failed to inhibit YAP transcriptional activity (Figure 3B). The authors should investigate whether NSP13 could bind to the TEAD binding sequence or the nearby sequence on the genome to modulate TEAD's DNA binding ability. Third, regarding the identified nuclear repressors, the authors should validate the interaction of NSP13 with the ones whose loss activates YAP transcriptional activity (Figure 4G). Lastly, why can't NSP13 bind TEAD4 in the cytoplasmic fractionation if both NSP13 and TEAD4 are detected there (Figure 3B)? This finding indicates their interaction is not a direct protein-protein interaction but is mediated by something in the nucleus, such as genomic DNA.

      (3a) We acknowledge that our NSP13 immunoprecipitation–mass spectrometry (IP-MS) did not identify any TEAD proteins (Fig. 4G and IP-MS tables). Several factors likely contributed to this outcome:

      (1) Low TEAD expression in HEK293T cells: Our IP-MS experiments were performed in HEK293T cells, which, according to the Human Protein Atlas, express TEAD1–4 at comparatively low levels (TEAD1: 16.5, TEAD2: 16.4, TEAD3: 4.9, TEAD4: 38.7 nTPM). In contrast, HeLa cells, where we successfully validated NSP13-mediated YAP suppression (Fig. 4H, Supplementary Fig.5B-D), show higher expression of these TEAD isoforms (TEAD1: 97.1, TEAD2: 27.3, TEAD3: 12.2, TEAD4: 48.1 nTPM). Therefore, insufficient TEAD abundance in HEK293T cells may limit the sensitivity needed to detect TEAD–NSP13 interactions in our proteomic screens.

      (2) Transience and potential DNA dependence: Our co-immunoprecipitation (co-IP) experiments (Fig. 4B, Supplementary Fig.4C-E) indicated that NSP13–TEAD4 binding is low-affinity. Under standard IP-MS conditions (which typically do not include chemical cross-linkers or nucleic acids to stabilize transient complexes), weak or short-lived interactions can be lost during washes or sample processing.

      (3) Additional supporting evidence: We carefully checked our IP-MS data and found that the well-known TEAD binding proteins, including CTBP1/2 and GATA4, were pulled down, suggesting TEAD’s absence does not rule out an NSP13–TEAD association.

      (3b) We sincerely appreciate the reviewer’s insightful suggestion. While we agree that mapping NSP13 occupancy at individual TEAD-binding motifs is valuable, we respectfully consider this to be beyond the scope of the current study. Biochemical and structural work on coronavirus NSP13 shows that it recognizes nucleic‑acid substrates primarily through their 5′ single‑stranded overhang and duplex architecture, not through a defined base sequence(2, 3). Accordingly, our data (Fig. 3B and 3F) indicate that DNA binding ability, rather than recognition of a specific motif, enables NSP13 to perform its helicase activity in proximity to TEAD and recruit repressors. Moreover, the DNA‑binding mutant K345A/K347A and the ATPase‑dead mutant R567A both fail to suppress YAP/TEAD transcription despite retaining the ability to interact with TEAD (Fig. 3B). These loss‑of‑function phenotypes demonstrate that NSP13’s chromatin engagement and unwinding activity, rather than sequence‑restricted targeting, are essential for repression. For these reasons, motif‑specific binding assays were not pursued in this revision, but we clarified in the discussion that NSP13’s DNA engagement is likely structural or TEAD-dependent, rather than sequence‑directed. We also highlighted this as an important avenue for future investigation.

      (3c) To validate the NSP13 interacting proteins from our IP-MS data, we generated plasmids expressing several candidates (CCT3, SMARCD1, EIF4A1, LMNA, TTF2, and YY2) and performed co-IP assays. As predicted, we confirmed the robust interaction between NSP13 and TEAD (Supplemental Fig. 5E). However, these putative nuclear repressors exhibited weak binding to NSP13 compared with TEAD4, suggesting that NSP13 associates with them indirectly, possibly as part of a larger multiprotein complex or depending on the chromatin structure, rather than via direct protein–protein interaction (Fig. 4J).

      (3d) We appreciate the reviewer’s question. To investigate whether their association might be DNA‐dependent, we performed co‐IP experiments using nuclear lysates in the presence or absence of various nucleases: Universal Nuclease (which degrades all forms of DNA and RNA), DNase I (which cleaves both single‐ and double‐stranded DNA), and RNase H (which selectively cleaves the RNA strand in RNA/DNA hybrids). Our findings revealed that nucleic acid removal did not disrupt the NSP13/TEAD4 interaction (Supplemental Fig.4E), indicating that their binding is not solely mediated by DNA or RNA.
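
      The TEAD expression values quoted in the response above for HEK293T and HeLa cells can be compared gene by gene. The snippet below is a minimal sketch that only recomputes ratios from the numbers stated in the response (attributed there to the Human Protein Atlas); the helper name `fold_difference` is hypothetical and is not part of the authors' analysis.

```python
# nTPM values exactly as quoted in the author response (not re-queried from any database).
hek293t = {"TEAD1": 16.5, "TEAD2": 16.4, "TEAD3": 4.9, "TEAD4": 38.7}
hela = {"TEAD1": 97.1, "TEAD2": 27.3, "TEAD3": 12.2, "TEAD4": 48.1}


def fold_difference(numerator, denominator):
    """Return the per-gene ratio numerator/denominator for genes present in both dicts."""
    return {
        gene: numerator[gene] / denominator[gene]
        for gene in numerator
        if gene in denominator
    }


hela_over_hek = fold_difference(hela, hek293t)
for gene in sorted(hela_over_hek):
    print(f"{gene}: HeLa/HEK293T = {hela_over_hek[gene]:.2f}")
```

      With these quoted values the ratio differs by paralog: it is largest for TEAD1 and smallest for TEAD4, the paralog used in the co-IP experiments.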

      Reviewer #2 (Public Review):

      Specific comments and suggestions for improvement of the manuscript:

      (1) NSP13 has been reported to block, in a helicase-dependent manner, episomal DNA transcription (PMID: 37347173), raising questions about the effects observed on the data shown from the HOP-Flash and 8xGTIIC assays. It would be valuable to demonstrate the specificity of the proposed effect of NSP13 on TEAD activation by YAP (versus broad effects on reporter assays) and also to show that NSP13 reduces the function of endogenous YAP-TEAD transcriptional activity (i.e., does ectopic NSP13 expression reduce the expression of YAP induced TEAD target genes in cells).

      We appreciate the reviewer’s comments and have carefully revisited the conclusions from the published paper(4) (PMID: 37347173), which reported that NSP13 suppresses episomal DNA transcription, as evidenced by reduced Renilla luciferase (driven by the herpes simplex virus thymidine kinase promoter) and GFP expression upon co‐expression with NSP13. For our experiments, we used a dual‐luciferase assay with Renilla luciferase (under the same promoter) as an internal control. After re-examining our raw Renilla luciferase data (now provided in the supplemental Excel file “Supporting data value”), we found that while 100 ng of NSP13 did not affect Renilla luciferase levels, 400 ng of NSP13 reduced them by approximately 50% relative to the YAP5SA‐only group (Supplemental Fig.2B, Fig.3C-D). We observed a similar reduction with NSP13 truncation mutants—an outcome not fully consistent with the published study (Supplemental Fig.3D, PMID: 37347173). However, unlike their finding of robust episomal DNA suppression, our data indicate that the K345A/K347A mutant of NSP13, which lacks DNA‐binding ability, completely lost its suppressive effect (Fig.3B).

      We performed additional Notch reporter assays to address the concern that NSP13 might nonspecifically inhibit episomal DNA transcription (including the HOP‑Flash and 8×GTIIC reporters). These experiments revealed that co‑expression of NSP13 with NICD (Notch intracellular domain) does not suppress Notch signaling (Supplemental Fig. 2C), indicating that NSP13 does not globally block all reporter systems. To evaluate whether NSP13 reduces endogenous YAP‑TEAD activity, we transiently overexpressed NSP13 WT and its R567A mutant in HeLa cells. However, bulk RNA‑seq and qPCR analyses did not reveal a clear decrease in YAP target genes, possibly due to the low transfection efficiency (< 50%, Supplemental Fig.4D). Interestingly, we observed that YAP5SA was predominantly retained in the nucleus upon NSP13 or R567A co‑expression, suggesting that NSP13 (or together with its interacting partners) restricts YAP5SA cytoplasmic shuttling. Future studies will involve stable cell lines expressing NSP13 WT or R567A to better characterize the mechanisms driving YAP5SA nuclear retention and clarify how NSP13 specifically suppresses YAP activity.
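
      The role of the Renilla internal control in this argument can be illustrated with a small calculation. The sketch below assumes the common firefly/Renilla ratio normalization; all luminescence readings are hypothetical placeholders, chosen only so that Renilla drops by about 50% as reported for 400 ng of NSP13, and are not the authors' data.

```python
def normalized_activity(firefly, renilla):
    """Reporter activity normalized to the Renilla internal control."""
    if renilla <= 0:
        raise ValueError("Renilla signal must be positive")
    return firefly / renilla


# Hypothetical readings (arbitrary luminescence units), for illustration only.
yap5sa_only = {"firefly": 1000.0, "renilla": 100.0}
yap5sa_plus_nsp13 = {"firefly": 200.0, "renilla": 50.0}  # Renilla reduced by ~50%

reference = normalized_activity(**yap5sa_only)
treated = normalized_activity(**yap5sa_plus_nsp13)

raw_fold = yap5sa_plus_nsp13["firefly"] / yap5sa_only["firefly"]
normalized_fold = treated / reference

print(f"raw firefly fold change: {raw_fold:.2f}")
print(f"Renilla-normalized fold change: {normalized_fold:.2f}")
```

      In this hypothetical case the normalized repression (0.40) is weaker than the raw firefly repression (0.20), because part of the raw drop is shared with the control reporter. Reporting both raw and normalized values, as the authors now do in the supplemental file, lets readers separate a general effect on episomal transcription from a YAP/TEAD-specific one.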

      (2) While the IP-MS experiment may have revealed new regulators of TEAD activity, the data presented are preliminary and inconclusive. No interactions are validated and beyond slight changes in TEAD reporter activity following knockdown, no direct links to YAP-TEAD are demonstrated, and no link to NSP13 was shown. Also, no details are provided about the methods used for the IP-MS experiment, raising some concerns about potential false positive associations within the data.

      We appreciate the reviewer’s feedback regarding our IP-MS findings and acknowledge that additional validation is required to establish definitive links between the identified putative regulators, YAP-TEAD, and NSP13. We have taken the following steps (and plan further experiments) to address these concerns:

      (2a) Co-IP validation: As in our response to Reviewer #1 (3c), we generated plasmids expressing several top candidate interactors from the IP-MS data (CCT3, SMARCD1, EIF4A1, LMNA, TTF2, and YY2) and performed direct co-IP assays in a more controlled setting. The results indicated that these putative NSP13 interactors had weaker binding compared to TEAD4, implying that NSP13 may associate with them as part of a larger complex or depending on the chromatin structure rather than through a direct protein–protein interaction (Fig. 4J).

      (2b) qPCR validation: Beyond reporter assays for evaluating YAP transactivation after the candidate YAP suppressor knockdown (Fig. 4H and Supplemental Fig. 5C), we performed qPCR to detect YAP activation on endogenous YAP-TEAD target genes (e.g., CTGF, CYR61, and AMOTL2) after CCT3 knockdown. Expression of CTGF and CYR61 was higher compared to control (Supplemental Fig. 5D), strengthening the case for an interaction relevant to YAP-TEAD signaling.

      (2c) To investigate how NSP13‐interacting proteins link to the YAP/TEAD complex, we examined the IP‑MS dataset and identified several well‐known YAP and TEAD binding partners, including CTBP1/2 (TEAD‐binding), GATA4 (TEAD‐binding), and multiple 14‐3‐3 isoforms (YWHAZ/YWHAB/YWHAH/YWHAQ, YAP binding). These findings suggest that NSP13 may form a larger nuclear complex with YAP/TEAD and associated cofactors. In the future, we will determine whether these putative TEAD regulators also interact with NSP13 under various conditions (e.g., in the presence or absence of DNA) and whether co‐expression of NSP13 influences their association with YAP or TEAD. This approach will clarify how NSP13 might leverage these factors to regulate YAP‐TEAD function.

      (2e) For the mass spectrometry experiments, HEK293T cells were transfected with Flag‐YAP1, HA‐NSP13, or Flag‐YAP1 + HA‐NSP13 according to the manufacturer’s standard protocols. After nuclear extraction and lysis, the supernatant was incubated with HA magnetic beads to immunoprecipitate (IP) NSP13. The IP samples were subsequently analyzed by mass spectrometry to identify NSP13‐associated proteins (Fig. 4F). Each experimental condition was performed in duplicate to ensure reproducibility. We included an appropriate negative control (Flag‐YAP1) and stringent data‐filtering criteria to minimize false positives. We apologize for not including these details in our original Methods section; in this revised manuscript, we have fully described the number of replicates, the controls used, and our data analysis pipelines.

    1. eLife Assessment

      The study presents valuable theoretical insights by attempting to classify pattern-forming gene subnetworks and exploring their potential mechanisms. However, the results are incomplete, as they rely on oversimplified models, limited classifications, and assumptions that may not hold in more complex or realistic scenarios.

    2. Reviewer #1 (Public review):

      Summary:

      The authors tackle a long-standing question in developmental theory: given a gene-regulatory network that includes extracellular signalling, which topologies are even capable of transforming an initial spatial profile into a genuinely new pattern? Building on the classical reaction-diffusion framework in one dimension, but imposing biologically motivated constraints, they prove that every one-signal sub-network must be either Hierarchical (H), self-activating (L+), or self-inhibiting (L-). They further demonstrate that only three composite classes of full networks - pure H, a coupled L+ L- "Turing" pair, and an L- module fed by an intracellular positive loop ("noise-amplifying") - can create non-trivial spatial transformations. Analytical criteria and illustrative simulations are provided, together yielding a closed taxonomy, which is supposed to be relevant for real systems.

      Strengths:

      (1) Useful classification framework. Reducing a vast number of possible gene circuits to three canonical pattern-forming motifs is a valuable organising insight for both theorists and experimentalists.

      (2) Logical completeness. All required cases are addressed, and the proofs elevate previous computational observations to formal statements.

      (3) Practical interpretability. Given a reaction network diagram, one can now decide (assuming the model applies to the real systems) whether spatial patterning is even possible, saving experimental effort on in-silico screens that could never succeed.

      Weaknesses:

      (1) The Results section is difficult to follow. Key logical steps and network configurations are described only briefly in prose, which constantly requires the reader to consult either the SI or other parts of the text (see the numerous references to requirements R1-R5 listed at the beginning of the paper) to gain even a minimal understanding. As a result, a scientifically literate but non-specialist reader may struggle to grasp the argument within a reasonable amount of time.

      (2) A central step in the model formulation is the linearisation of the reaction term around a homogeneous steady state; higher-order kinetics, including ubiquitous bimolecular sinks such as A + B → AB, are simply collapsed into the Jacobian without any stated amplitude bound on the perturbations. Because the manuscript never analyses how far this assumption can be relaxed, the robustness of the three-class taxonomy under realistic nonlinear reactions or large spike amplitudes remains uncertain.

      (3) All modelling is confined to one spatial dimension, and the very definition of a "non-trivial" transformation is framed in terms of peak positions along a line, which clearly must be reformulated for higher dimensions. It's well-known that diffusions in 1, 2, and 3 dimensions are also dramatically different, so the relevance of the three-class taxonomy to real multicellular tissues remains unclear, or at least should be explained in more detail.
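
      The concern in point (2) about bimolecular sinks can be made concrete with a short worked expansion. The following is a sketch under stated assumptions: the rate constant k, the concentrations a and b, the homogeneous steady-state values a*, b*, and the deviations δa, δb are illustrative symbols, not the manuscript's notation.

```latex
% Illustrative only: k, a, b, a^*, b^* are assumed symbols, not the manuscript's notation.
\begin{align}
  -k\,a\,b
  &= -k\,(a^{*}+\delta a)(b^{*}+\delta b) \\
  &= \underbrace{-k\,a^{*}b^{*}}_{\text{zeroth order}}
     \;\underbrace{-\,k\,\bigl(b^{*}\,\delta a + a^{*}\,\delta b\bigr)}_{\text{linear (Jacobian) part}}
     \;\underbrace{-\,k\,\delta a\,\delta b}_{\text{dropped on linearisation}}
\end{align}
```

      Two things follow from this expansion. The Jacobian entries, -k b* and -k a*, depend on the chosen steady state, and the dropped term becomes comparable to the linear part once |δa| approaches a* or |δb| approaches b*. For a reference state with a* = b* = 0, the linear contribution of this reaction vanishes entirely and the neglected quadratic term is the leading one, which is why an explicit amplitude bound matters.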

      Discussion:

      As stated above, there are several uncertainties about the relevance of the presented framework for real systems. However, if the results hold, researchers could look at a gene-network diagram and quickly judge whether it can make spatial patterns and, if so, which of the three known mechanisms it will use. That shortcut would save experimental and computational time. In the case that the results don't hold for the real systems, the authors' proof tools at least give theorists a solid base they can extend to more complex cases.

    3. Reviewer #2 (Public review):

      Summary:

      This study explores how gene regulatory networks that include intra- and extracellular signaling can give rise to spatial patterns of gene expression in cells. The authors investigate this question in a simplified theoretical framework, where all cells are assumed to respond identically to signals, and spatial details such as cell boundaries and extensions are abstracted away. Within this setting, they identify three distinct signaling topologies, referred to as L and H types, and combine them into three minimal subnetworks capable of generating patterns. The study analyzes possible combinations of these topologies and examines how each subnetwork behaves under three different initial conditions. Combining the analyses with mathematical proofs and heuristic arguments, the authors define necessary conditions under which such networks can produce non-trivial spatial patterns.

      Strengths:

      The authors break down larger gene regulatory networks into smaller subnetworks, which allows for a more tractable analysis of pattern formation. These minimal subnetworks are examined under different initial conditions, providing a range of examples for how patterns can emerge in simplified settings. The study also proposes necessary conditions for pattern formation, which may be useful for identifying relevant network structures. In addition, the manuscript offers heuristic explanations for the emergence of patterns in each subnetwork, which help to interpret the simulation results and analytical criteria.

      Weaknesses:

      (1) We have serious concerns regarding the validity of the simulation results presented in the manuscript. Rather than simulating the full nonlinear system described by Equation (1), the authors base their results on a truncated expansion (Equation S.8.2) that captures only the time evolution of small deviations around a spatially homogeneous steady state. However, it remains unclear how this reduced system is derived from the full equations - specifically, which terms are retained or neglected and why - and how the expansion of the nonlinear function can be steady-state independent, as claimed. Additionally, in simulations involving the spike plus homogeneous initial condition, it is not evident - or, where equations are provided, it is not correct - that the assumed global homogeneous background actually corresponds to a steady state of the full dynamics. We elaborate on these concerns in the following:

      It is assumed that the homogeneous steady states are given by g_i=0 and g_i=c_i, where 1/c_i = \mu_i or \hat{\mu}_i, independently of the specific network structure. However, the basis for this assumption is unclear, especially since some of the functions do not satisfy this condition - for example, f5 as defined below Eq. S8.10.5. Moreover, if g_i=c_i does not correspond to a true steady state, then the time evolution of deviations from this state is not correctly described by Eq. S8.2, as the zeroth-order terms do not vanish in that case.

      Additionally, the equations used contain only linear terms and a cubic degradation term for each species g_i, while neglecting all quadratic terms and cubic terms involving cross-species interactions (i≠j). An explanation for this selective truncation is not provided, and without knowledge of the full equation (f), it is impossible to assess whether this expansion is mathematically justified. If, as suggested in the Supplementary Information, the linear and cubic terms are derived from f, then at the very least, the Jacobian matrix should depend on the background steady-state concentration. However, the equations for the small deviation around a steady state (including the Jacobian matrix) used in the simulations appear to be independent of the particular steady state concentration.

      This is why we believe that the differences observed between the spike-only initial condition and the spike superimposed on a homogeneous background are not due to the initial conditions themselves, but rather result from a modified reaction scheme introduced through a questionable cutoff.

      "In simulations with spike initial patterns, the reference value g≡0 represents an actual concentration of 0 and therefore, we must add to (S8.2) a Heaviside function Φ acting of f (i.e., Φ(f(g))=f(g) if f(g)>0, Φ(f(g))=0 if f(g)≤0) to prevent the existence of negative concentrations for any gene product (i.e., g_i<0 for some i)." (SI chapter S8).

      This cutoff alters the dynamics (no inhibition) and introduces a different reaction scheme between the two simulations. The need for this correction may itself reflect either a problem in the original equations (which should fulfill the necessary conditions and prevent negative concentrations (R4 in main text)) or the inappropriateness of using an expanded approximation that assumes independence of the steady-state concentration. It is already questionable whether the linearized equations with a cubic degradation term are valid for the spike initial conditions (with different background concentration values), as the amplitude of this perturbation seems rather large.

      Lastly, we note that under the current simulation scheme, it is not possible to meaningfully assess criteria RH2a and RH2b, as they rely on nonlinear interactions that are absent from the implemented dynamics.
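
      The effect of the quoted cutoff on the dynamics can be sketched numerically. The example below reads the SI definition literally, with Φ applied to the whole reaction term f (an assumption, since the authors' implementation is not shown), and uses a hypothetical one-species decay law with forward Euler stepping; none of the names or parameter values come from the manuscript.

```python
def cutoff(rate):
    """Phi as quoted from the SI: keep positive rates, replace non-positive rates by zero."""
    return rate if rate > 0 else 0.0


def euler(g0, rate_fn, dt, steps, use_cutoff):
    """Integrate dg/dt = rate_fn(g) with forward Euler, optionally applying the cutoff."""
    g = g0
    for _ in range(steps):
        rate = rate_fn(g)
        if use_cutoff:
            rate = cutoff(rate)
        g += dt * rate
    return g


def decay(g):
    """Hypothetical reaction term: pure linear degradation with rate constant 0.5."""
    return -0.5 * g


g_plain = euler(1.0, decay, dt=0.1, steps=100, use_cutoff=False)
g_cut = euler(1.0, decay, dt=0.1, steps=100, use_cutoff=True)

print(f"without cutoff: g = {g_plain:.4f}")
print(f"with cutoff:    g = {g_cut:.4f}")
```

      Under this literal reading, the cutoff does more than forbid negative concentrations: any net negative rate is suppressed, so the species can never decrease even when it is far from zero. That is one way to see why the two simulation types effectively follow different reaction schemes.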

      (2) Most of the proofs presented in the Supplementary Information rely on linearized versions of the governing equations, and it remains unclear how these results extend to the fully nonlinear system. We are concerned that the generality of the conclusions drawn from the linear analysis may be overstated in the main text. For example, in Section S3, the authors introduce the concept of dynamic equivalence of transitive chains (Proposition S3.1) and intracellular transitive M-branching (Proposition S3.2), which pertains to the system's steady-state behavior. However, the proof is based solely on the linearized equations, without additional justification for why the result should hold in the presence of nonlinearities. Moreover, the linearized system is used to analyze the response to a "spike initial pattern of arbitrary height C" (SI Chapter S5.1), yet it is not clear how conclusions derived from the linear regime can be valid for large perturbations, where nonlinear effects are expected to play a significant role. We encourage the authors to clarify the assumptions under which the linearized analysis remains valid and to discuss the potential limitations of applying these results to the nonlinear regime.
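
      The assumptions behind the linearized analysis can be stated explicitly with a generic expansion. This is a sketch assuming a standard reaction-diffusion form with reaction term f, diffusion matrix D, and a homogeneous, time-independent reference state g*; the manuscript's Equation (1) is not reproduced here, so the notation is an assumption.

```latex
% Generic form assumed for illustration; g^* is homogeneous and time-independent.
\begin{align}
  \partial_t g &= f(g) + D\,\nabla^{2} g, \qquad g = g^{*} + \delta g, \\
  \partial_t\,\delta g &= f(g^{*}) + J(g^{*})\,\delta g + D\,\nabla^{2}\delta g
     + \mathcal{O}\!\left(\lVert \delta g \rVert^{2}\right),
  \qquad
  J_{ij}(g^{*}) = \left.\frac{\partial f_i}{\partial g_j}\right|_{g=g^{*}}
\end{align}
```

      The zeroth-order term f(g*) vanishes only if g* is a true steady state, the Jacobian J is evaluated at g* and therefore changes with it, and the remainder is negligible only while the deviation stays small. A spike of arbitrary height C does not guarantee the last condition.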

      (3) Several statements in the main text are presented without accompanying proof or sufficient explanation, which makes it difficult to assess their validity. In some cases, the lack of justification raises serious doubts about whether the claims are generally true. Examples are:

      "For the purpose of clarity we will explain our results as if these cells have a simple arrangement in space (e.g., a 1D line or a 2D square lattice) but, as we will discuss, our results shall apply with the same logic to any distribution of cells in space." (Main text l.145-l.148).

      "For any non-trivial pattern transformation (as long as it is symmetric around the initial spike), there exists an H gene network capable of producing it from a spike initial pattern." (Main text l.366f).

      "In 2D there are no peaks but concentric rings of high gene product concentration centered around the spike, while in 3D there are concentric spherical shells." (Main text l. 447ff).

      (4) The study identifies one-signal networks and examines how combinations of these structures can give rise to minimal pattern-forming subnetworks. However, the analysis of the combinations of these minimal pattern-forming subnetworks remains relatively brief, and the manuscript does not explore how the results might change if the subnetworks were combined in upstream and downstream configurations. In our view, it is not evident that all possible gene regulatory networks can be fully characterized by these categories, nor that the resulting patterns can be reliably predicted. Rather, the approach appears more suited to identifying which known subnetworks are present within a larger network, without necessarily capturing the full dynamics of more complex configurations.

      (5) The definition of non-trivial pattern formation is provided only in the Supplementary Information, despite its central importance for interpreting the main results. It would significantly improve clarity if this definition were included and explained in the main text. Additionally, it remains unclear how the definition is consistently applied across the different initial conditions. In particular, the authors should clarify how slope-based measures are determined for both the random noise and sharp peak/step function initial states. Furthermore, the authors do not specify how the sign function is evaluated at zero. If the standard mathematical definition sgn(0)=0 is used, then even a simple widening of a peak could fulfill the criterion for non-trivial pattern transformation.
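
      To make the concern about sgn(0) concrete, the following minimal sketch compares slope signs between a sharp spike and a merely widened peak. The criterion coded here (`differs_in_slope_sign`) is a hypothetical stand-in, not the authors' definition from the Supplementary Information, and the pattern values are invented for illustration.

```python
def sign(x):
    """Standard sign function: returns -1, 0 or 1, with sign(0) == 0."""
    return (x > 0) - (x < 0)

def slope_signs(pattern):
    """Sign of the discrete slope between neighbouring cells."""
    return [sign(b - a) for a, b in zip(pattern, pattern[1:])]

def differs_in_slope_sign(initial, final):
    """Hypothetical criterion: any cell-to-cell slope whose sign changes."""
    return slope_signs(initial) != slope_signs(final)

def differs_ignoring_flat(initial, final):
    """Variant that only compares slopes that are non-zero in the initial pattern."""
    pairs = zip(slope_signs(initial), slope_signs(final))
    return any(s0 != s1 for s0, s1 in pairs if s0 != 0)

# Hypothetical 1D patterns over 7 cells (illustrative values only).
spike = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]      # sharp initial spike
widened = [0.0, 0.1, 0.5, 1.0, 0.5, 0.1, 0.0]    # same single peak, just broader

print(differs_in_slope_sign(spike, widened))   # flat regions (sign 0) become sloped
print(differs_ignoring_flat(spike, widened))   # the existing slopes keep their sign
```

      Under this assumed criterion, the widened peak counts as a changed pattern only because the flat flanks of the spike have slope sign 0; if flat regions are excluded, the two patterns are indistinguishable. Which convention the authors intend is exactly what needs to be stated.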

      (6) The manuscript lacks a clear and detailed explanation of the underlying model and its assumptions. In particular, it is not well-defined what constitutes a "cell" in the context of the model, nor is it justified why spatial features of cells - such as their size or boundaries - can be neglected. Furthermore, the concept of the extracellular space in the one-dimensional model remains ambiguous, making it unclear which gene products are assumed to diffuse.

    4. Reviewer #3 (Public review):

      Pattern formation is responsible for generating the spatial organization of cells, tissues, and organs during embryogenesis. It operates within a multifactorial system including initial conditions, gene regulatory networks, extracellular signals, mechanical forces, stochastic noise, and environmental inputs. Finally, it ensures the functional anatomy of an organism.

      This study focuses on one central aspect of pattern formation: how spatial heterogeneity arises from an initial condition and evolves into a more complex or distinct spatial pattern (non-trivial pattern formation, as the authors term it). The authors made efforts to explore and characterize all possible ways to achieve pattern formation. They do this by discussing how extracellular signals spread, how individual cells respond to those signals, and how those responses, in turn, modulate signal propagation.

      Finally, their comprehensive analysis concludes that there are three classes of interactions between extracellular signals and intracellular responses, corresponding to previously known mechanisms that can generate spatial patterns: differences in morphogen concentrations in space, noise amplification, and Turing patterns.

    1. eLife Assessment

      This study presents a sequence-based method for predicting drug-interacting residues in intrinsically disordered proteins (IDPs), addressing an important challenge in understanding small-molecule:IDP interactions. The findings have solid support in illustrative examples that underscore the role of aromatic interactions. While predicted binding sites remain coarse, validation was done on a total of 10 IDPs, four of them thoroughly and six less so. The method builds on previous work from the authors, with necessarily ad hoc modifications, and offers a starting point for further exploration in this emerging field.

    2. Reviewer #1 (Public review):

      Summary:

      The authors developed a sequence-based method to predict drug-interacting residues in IDPs, building on their recent work on predicting the transverse relaxation rates (R2) of IDPs, which was trained on 45 IDP sequences and their corresponding R2 values. The main finding is that IDPs interact with drugs mostly through aromatic residues, which is easy to understand, as most drugs contain aromatic rings. They validated the method using several case studies, and the predictions are in accordance with chemical shift perturbations and MD simulations. The location of the predicted residues serves as a starting point for ligand optimization.

      Strengths:

      This work provides the first sequence-based prediction method to identify potential drug-interacting residues in IDPs. The validity of the method is supported by case studies. It is easy to use, and no time-consuming MD simulations or NMR studies are needed.

      Weaknesses:

      The method does not depend on the information of binding compounds, which may give general features of IDP-drug binding. However, due to the size and chemical structures of the compounds (for example, how many aromatic rings), the number of interacting residues varies, which is not considered in this work. Lacking specific information may restrict its application in compound optimization, aiming to derive specific and potent binding compounds.

    3. Reviewer #2 (Public review):

      Summary:

      In this work, the authors introduce DIRseq, a fast, sequence-based method that predicts drug-interacting residues (DIRs) in IDPs without requiring structural or drug information. DIRseq builds on the authors' prior work looking at NMR relaxation rates, and presumes that those residues that show enhanced R2 values are the residues that will interact with drugs, allowing these residues to be nominated from the sequence directly. By making small modifications to their prior tool, DIRseq enables the prediction of residues seen to interact with small molecules in vivo.

      Strengths:

      The preprint is well written and easy to follow.

      Weaknesses:

      (1) The DIRseq method is based on SeqDYN, which itself is a simple (which I do not mean as a negative - simple is good!) statistical predictor for R2 relaxation rates. The challenge here is that R2 rates cover a range of timescales, so the physical intuition as to what exactly elevated R2 values mean is not necessarily consistent with "drug interacting". Presumably, the authors are not using the helix boost component of SeqDYN here (it would be good to explicitly state this). This is not necessarily a weakness, but I think it would behove the authors to compare a few alternative models before settling on the DIRseq method, given the somewhat ad hoc modifications to SeqDYN to get DIRseq.

      Specifically, the authors previously showed good correlation between the stickiness parameter of Tesei et al. and the inferred "q" parameter for SeqDYN; as such, I am left wondering if comparable accuracy would be obtained simply by taking the stickiness parameters directly and using these to predict "drug interacting residues", at which point I'd argue we're not really predicting "drug interacting residues" as much as we're predicting "sticky" residues, using the stickiness parameters. It would, I think, be worth the authors comparing the predictive power obtained from DIRseq with the predictive power obtained by using the lambda coefficients from Tesei et al. in the model, local density of aromatic residues, local hydrophobicity (note that Tesei et al. have tabulated a large set of hydrophobicity scores!) and the raw SeqDYN predictions. In the absence of lots of data to compare against, this is another way to convince readers that DIRseq offers reasonable predictive power.
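
      As an illustration of how simple one of these baselines could be, the sketch below computes a local density of aromatic residues along a sequence. It is a hypothetical baseline, not DIRseq or SeqDYN: the window size, the choice to count only F, W and Y as aromatic, and the toy sequence are all assumptions made for the example.

```python
AROMATIC = set("FWY")  # assumption: histidine is not counted as aromatic here

def aromatic_density(sequence, half_window=3):
    """Fraction of aromatic residues in a window centred on each position.

    Windows are truncated at the sequence termini, so the denominator is the
    actual number of residues in the window.
    """
    seq = sequence.upper()
    n = len(seq)
    profile = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        window = seq[lo:hi]
        profile.append(sum(res in AROMATIC for res in window) / len(window))
    return profile

def top_residues(profile, k=3):
    """Zero-based indices of the k positions with the highest score."""
    return sorted(range(len(profile)), key=lambda i: profile[i], reverse=True)[:k]

# Toy sequence (not a real IDP) with one aromatic cluster in the middle.
toy = "GGGGFYWGGGG"
profile = aromatic_density(toy, half_window=1)
print(top_residues(profile, k=1))
```

      Scoring such a profile against the same chemical shift perturbation data used for DIRseq would show how much of the predictive power comes from aromatic content alone.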

      (2) Second, DIRseq is essentially SeqDYN with some changes to it, but those changes appear somewhat ad hoc. I recognize that there is very limited data, but the tweaking of parameters based on physical intuition feels a bit stochastic as a way of developing a method; presumably (while not explicitly spelt out) those tweaks were chosen to give better agreement with the very limited experimental data (otherwise why make the changes?), which does raise the question of whether the DIRseq implementation of SeqDYN is rather over-parameterized to the (very limited) data available now. I want to be clear, the authors should not be critiqued for attempting to develop a model despite a paucity of data, and I'm not necessarily saying this is a problem, but I think it would be really important for the authors to acknowledge to the reader that, with such limited data, it's possible the model is over-fit to the specific sequences studied previously, and that how well it generalizes will only be seen as more data are collected.

      (3) Third, perhaps my biggest concern here is that, implicit in the authors' assumptions, all "drugs" interact with IDPs in the same way and all drugs are "small" (motivating the change in correlation length). Prescribing a specific lengthscale and chemistry to all drugs seems broadly inconsistent with a world in which we presume drugs offer some degree of specificity. While it is perhaps not unexpected that aromatic-rich small molecules tend to interact with aromatic residues, the logical conclusion from this work, if one assumes DIRseq has utility, is that all IDRs bind drugs with similar chemical biases. This, at the very least, deserves some discussion.

      (4) Fourth, the authors make some general claims in the introduction regarding the state of the art, which appear to lack sufficient data to be made. I don't necessarily disagree with the authors' points, but I'm not sure the claims (as stated) can be made absent strong data to support them. For example, the authors state: "Although an IDP can be locked into a specific conformation by a drug molecule in rare cases, the prevailing scenario is that the protein remains disordered upon drug binding." But is this true? The authors should provide evidence to support this assertion, both examples in which this happens, and evidence to support the idea that it's the "prevailing view" and specific examples where these types of interactions have been biophysically characterized.

      Similarly, they go on to say:

      "Consequently, the IDP-drug complex typically samples a vast conformational space, and the drug molecule only exhibits preferences, rather than exclusiveness, for interacting with subsets of residues." But again, where is the data to support this assertion? I don't necessarily disagree, but we need specific empirical studies to justify declarative claims like this; otherwise, we propagate lore into the scientific literature. The use of "typically" here is a strong claim, implying most IDP complexes behave in a certain way, yet how can the authors make such a claim?

      Finally, they continue to claim:

      "Such drug interacting residues (DIRs), akin to binding pockets in structured proteins, are key to optimizing compounds and elucidating the mechanism of action." But again, is this a fact or a hypothesis? If the latter, it must be stated as such; if the former, we need data and evidence to support the claim.

    4. Author response:

      Reviewer #1 (Public review):

      Summary:

      The authors developed a sequence-based method to predict drug-interacting residues in IDPs, building on their recent work on predicting the transverse relaxation rates (R2) of IDPs, which was trained on 45 IDP sequences and their corresponding R2 values. The main finding is that IDPs interact with drugs mostly through aromatic residues, which is easy to understand, as most drugs contain aromatic rings. They validated the method using several case studies, and the predictions are in accordance with chemical shift perturbations and MD simulations. The location of the predicted residues serves as a starting point for ligand optimization.

      Strengths:

      This work provides the first sequence-based prediction method to identify potential drug-interacting residues in IDPs. The validity of the method is supported by case studies. It is easy to use, and no time-consuming MD simulations or NMR studies are needed.

      Weaknesses:

      The method does not depend on the information of binding compounds, which may give general features of IDP-drug binding. However, due to the size and chemical structures of the compounds (for example, how many aromatic rings), the number of interacting residues varies, which is not considered in this work. Lacking specific information may restrict its application in compound optimization, aiming to derive specific and potent binding compounds.

      We fully recognize that different compounds may have different interaction propensity profiles along the IDP sequence. In future studies, we will investigate compound-specific parameter values. The limiting factor is training data, but such data are beginning to be available.

      Reviewer #2 (Public review):

      Summary:

      In this work, the authors introduce DIRseq, a fast, sequence-based method that predicts drug-interacting residues (DIRs) in IDPs without requiring structural or drug information. DIRseq builds on the authors' prior work looking at NMR relaxation rates, and presumes that those residues that show enhanced R2 values are the residues that will interact with drugs, allowing these residues to be nominated from the sequence directly. By making small modifications to their prior tool, DIRseq enables the prediction of residues seen to interact with small molecules in vivo.

      Strengths:

      The preprint is well written and easy to follow.

      Weaknesses:

      (1) The DIRseq method is based on SeqDYN, which itself is a simple (which I do not mean as a negative - simple is good!) statistical predictor for R2 relaxation rates. The challenge here is that R2 rates cover a range of timescales, so the physical intuition as to what exactly elevated R2 values mean is not necessarily consistent with "drug interacting". Presumably, the authors are not using the helix boost component of SeqDYN here (it would be good to explicitly state this). This is not necessarily a weakness, but I think it would behove the authors to compare a few alternative models before settling on the DIRseq method, given the somewhat ad hoc modifications to SeqDYN to get DIRseq.

      Actually, the factors that elevate R2 are well-established. These are local interactions and residual secondary structures (if any). The basic assumption of our method is that intra-IDP interactions that elevate R2 convert to IDP-drug interactions. This assumption was supported by our initial observation that the drug interaction propensity profiles predicted using the original SeqDYN parameters already showed good agreement with CSP profiles. We only made relatively small adjustments to the parameters to improve the agreement. Indeed, we did not apply the helix boost portion of SeqDYN to DIRseq, and will state so explicitly. We will also compare DIRseq with several alternative models.

      Specifically, the authors previously showed good correlation between the stickiness parameter of Tesei et al. and the inferred "q" parameter for SeqDYN; as such, I am left wondering if comparable accuracy would be obtained simply by taking the stickiness parameters directly and using these to predict "drug interacting residues", at which point I'd argue we're not really predicting "drug interacting residues" as much as we're predicting "sticky" residues, using the stickiness parameters. It would, I think, be worth the authors comparing the predictive power obtained from DIRseq with the predictive power obtained by using the lambda coefficients from Tesei et al. in the model, local density of aromatic residues, local hydrophobicity (note that Tesei et al. have tabulated a large set of hydrophobicity scores!) and the raw SeqDYN predictions. In the absence of lots of data to compare against, this is another way to convince readers that DIRseq offers reasonable predictive power.

      We will compare predictions of these various parameter sets, and summarize the results in a table.

      (2) Second, DIRseq is essentially SeqDYN with some changes to it, but those changes appear somewhat ad hoc. I recognize that there is very limited data, but the tweaking of parameters based on physical intuition feels a bit stochastic as a way of developing a method; presumably (while not explicitly spelt out) those tweaks were chosen to give better agreement with the very limited experimental data (otherwise why make the changes?), which does raise the question of whether the DIRseq implementation of SeqDYN is rather over-parameterized to the (very limited) data available now. I want to be clear, the authors should not be critiqued for attempting to develop a model despite a paucity of data, and I'm not necessarily saying this is a problem, but I think it would be really important for the authors to acknowledge to the reader that, with such limited data, it's possible the model is over-fit to the specific sequences studied previously, and that how well it generalizes will only be seen as more data are collected.

      We have explained the rationale for the parameter tweaks, which were limited to q values for four amino-acid types, i.e., to deemphasize hydrophobic interactions and slightly enhance electrostatic interactions (p. 4-5). We will add that these tweaks were motivated by observations from MD simulations of drug interactions with α-syn (ref 20). As already noted in the response to the preceding comment, we will also present results for the original parameter values as well as for when the four q values are changed one at a time.

      (3) Third, perhaps my biggest concern here is that, implicit in the authors' assumptions, all "drugs" interact with IDPs in the same way and all drugs are "small" (motivating the change in correlation length). Prescribing a specific lengthscale and chemistry to all drugs seems broadly inconsistent with a world in which we presume drugs offer some degree of specificity. While it is perhaps not unexpected that aromatic-rich small molecules tend to interact with aromatic residues, the logical conclusion from this work, if one assumes DIRseq has utility, is that all IDRs bind drugs with similar chemical biases. This, at the very least, deserves some discussion.

      The reviewer raises a very important point. In Discussion, we will add that it is important to further develop DIRseq to include drug-specific parameters when data for training become available.

      (4) Fourth, the authors make some general claims in the introduction regarding the state of the art, which appear to lack sufficient data to be made. I don't necessarily disagree with the authors' points, but I'm not sure the claims (as stated) can be made absent strong data to support them. For example, the authors state: "Although an IDP can be locked into a specific conformation by a drug molecule in rare cases, the prevailing scenario is that the protein remains disordered upon drug binding." But is this true? The authors should provide evidence to support this assertion, both examples in which this happens, and evidence to support the idea that it's the "prevailing view" and specific examples where these types of interactions have been biophysically characterized.

      We will cite several studies showing that IDPs remain disordered upon drug binding.

      Similarly, they go on to say:

      "Consequently, the IDP-drug complex typically samples a vast conformational space, and the drug molecule only exhibits preferences, rather than exclusiveness, for interacting with subsets of residues." But again, where is the data to support this assertion? I don't necessarily disagree, but we need specific empirical studies to justify declarative claims like this; otherwise, we propagate lore into the scientific literature. The use of "typically" here is a strong claim, implying most IDP complexes behave in a certain way, yet how can the authors make such a claim? 

      Here again we will add citations to support the statement.

      Finally, they continue to claim:

      "Such drug interacting residues (DIRs), akin to binding pockets in structured proteins, are key to optimizing compounds and elucidating the mechanism of action." But again, is this a fact or a hypothesis? If the latter, it must be stated as such; if the former, we need data and evidence to support the claim. 

      We will add citations to both compound optimization and mechanism of action.

    1. eLife Assessment

      This work presents valuable new information on the microtubule-binding mode of the microtubule kinesin-13, MCAK. The authors use quantitative single-molecule studies to propose that MCAK preferentially binds to a GDP-Pi-tubulin portion of the microtubule end. However, the evidence provided to support this claim remains incomplete and would benefit from more rigorous methodology; in particular, the diffraction-limited experiments do not provide sufficient spatial resolution to support the authors' conclusions. In addition, a more thorough discussion of the existing literature would further strengthen the manuscript.

    2. Reviewer #1 (Public review):

      The authors responded to multiple criticisms with additional data and more detailed statistics, in some instances improving the quality of the work. However, I had difficulty understanding some of the authors' responses. The logic was not always apparent, the writing was occasionally confusing or would benefit from more careful wording, and some of the provided responses were superficial or raised new concerns. In some cases, the underlying data needed to support their responses were not shown. Thus, the current version of the manuscript does not sufficiently resolve the following critical issues raised by myself and other reviewers.

      (1) A clear new insight into a physiological process or cellular behavior remains lacking. The study largely confirms prior observations of MCAK binding to both the microtubule wall and end. However, it is still unclear whether direct binding to the tip (as opposed to accumulation via wall diffusion or interaction with other tip-binding proteins) is a significant mechanism.

      (2) The newly revealed adenosine-nucleotide-dependent binding preferences do not help clarify MCAK's catalytic function or its mechanisms of tip recognition. Consequently, the final summary figure remains speculative and is not convincingly supported by the data. It is also unclear what exactly is meant by the "working model" (figure title), or by the claim of "a simple rule of how the end-binding regulators coordinate their activities" (abstract).

      (3) As noted in my previous review, the effects of adding different adenosine nucleotides on MCAK binding to microtubules are much more pronounced than the differences in MCAK binding to tubulin with various guanosine-containing nucleotides, or to lattice versus tip (e.g., Fig. 5E). Therefore, the manuscript title, "MCAK recognizes the nucleotide-dependent feature at growing microtubule ends", does not do justice to the scale of these effects.

      (4) The title implies that MCAK selectively recognizes a feature determined by the tubulin-bound guanosine nucleotide. However, the authors frequently claim that MCAK binds to the "entire GTP cap." It appears that they exclude structural protrusions from their definition of the cap, which is debatable. Even using their definition, the conclusion that MCAK recognizes a specific "nucleotide-dependent feature" seems inconsistent with the claim that it binds uniformly across the cap. These distinctions were not made clear.

      (5) Some important technical details are still absent. For example, when reading the authors' response to another reviewer's question, I could not find an explanation of how the kon values for end and wall binding were calculated. These calculations clearly require assumptions, e.g. about the number of binding sites, but these details are not described. In addition, the binding data are expressed in units per tubulin dimer, which are non-standard and make comparisons to other published results difficult. There are other instances where more technical detail would be desirable, but they are too numerous to list here.
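
      For concreteness, one common way to obtain such on-rates from single-molecule binding events is sketched below. This is an assumption about the general form of the calculation, not the authors' stated procedure; the symbols N_bind (number of observed binding events), T (total observation time) and N_sites (number of available binding sites) are introduced here for illustration, and the lattice site count assumes 13 protofilaments and 8 nm tubulin dimers over a length L.

```latex
k_{\mathrm{on}} \approx \frac{N_{\mathrm{bind}}}{[\mathrm{MCAK}] \, T \, N_{\mathrm{sites}}},
\qquad
N_{\mathrm{sites}}^{\mathrm{lattice}} \approx 13 \times \frac{L}{8\ \mathrm{nm}}
```

      Under these assumptions, a 1 µm stretch of lattice holds about 1625 dimers (13 × 125), whereas the number of sites assigned to the end depends entirely on how the end region is defined. The end-versus-wall comparison therefore depends directly on a choice that should be stated.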

      (6) Several aspects of data presentation as graphs will make it difficult for other researchers to analyze or interpret the findings. Numerical Excel-style data sheets should be provided for all measurements, including raw data, not just the ratios or derived values shown in plots. Other, more significant issues include use of mean values for non-Gaussian distributions (e.g., dwell times); binding affinities inferred from single-concentration measurements, often under varying conditions (e.g., Figs. 3C, 4); and absence of side-by-side plotted controls (e.g., Fig. 6).

      (7) While the authors have added some quantitative values and descriptive detail, the manuscript still lacks a critical comparison of their findings with existing literature. This weakens the impact of the study and limits the reader's ability to place the results in a broader context.

    3. Reviewer #4 (Public review):

      The revised manuscript from Chen et al. implements many of the changes requested by the 3 reviewers of the initial submission. These changes are well-described in the corresponding Response to Reviews document. Of course, not every request from the reviewers was addressed, and the following major concerns remain:

      (1) The authors argue that MCAK binds to the same region as EB proteins, which they refer to as the "EB cap". Reviewers asked for experiments that would increase the size of the EB cap to create "comets" (e.g. by increasing the microtubule growth rate); the prediction is that the MCAK signal should increase in size as well. The authors declined to pursue these experiments. As a result, the EB signals and MCAK signals are diffraction-limited spots, as opposed to the predicted exponential decay signals characteristic of EB comets. The various diffraction-limited spots are then aligned with the diffraction-limited signal of the microtubule end. These alignments and sub-pixel comparisons are technically challenging. The revised manuscript does not go far enough to provide compelling evidence that all technical challenges were overcome. Thus, while the authors can safely conclude that MCAK, EBs, and the microtubule end do occupy the same diffraction-limited spot, more precise conclusions are not supported.

      (2) The reviewers criticized the initial manuscript for neglecting key references, particularly Kinoshita et al., Science 2001. Indeed, I cannot fathom writing a manuscript about MCAK and XMAP215 without putting a citation to such a landmark paper front and center. The authors have responded by including more discussion of the relevant literature (and citing Kinoshita et al.). However, the revised manuscript is often still cursory in giving credit where credit is due, contextualizing the new data, and generally engaging with the scholarship on MCAK.

      (3) The data presented does not include a simple measurement of the impact of MCAK on the catastrophe frequency of microtubules. The authors explain this absence by pointing out that their movies are short (5 min) and high frame rate (10 fps). While I understand that such imaging parameters are necessary to capture single molecule end-binding events, I do not understand why a separate set of experiments could not be performed. This type of "positive control" is often missing, as pointed out by the 3 reviewers.

      (4) Salt conditions, protein concentrations, and other key experimental parameters are not varied, even when varying them would provide excellent tests of the authors' hypotheses.

      In summary, the revised manuscript is improved in many ways, but the interested reader should look carefully at the previous reviews and compare the measurements presented here with those of other labs.

    1. eLife Assessment

      By taking advantage of noise in gene expression, this important study introduces a new approach for detecting directed causal interactions between two genes without perturbing either. The main theoretical result is supported by a proof. Preliminary simulations and experiments on small circuits are solid, but further investigations are needed to demonstrate the broad applicability and scalability of the method.

    2. Reviewer #2 (Public Review):

      Summary:

      This paper describes a new approach to detecting directed causal interactions between two genes without directly perturbing either gene. To check whether gene X influences gene Z, a reporter gene (Y) is engineered into the cell in such a way that (1) Y is under the same transcriptional control as X, and (2) Y does not influence Z. Then, under the null hypothesis that X does not affect Z, the authors derive an equation that describes the relationship between the covariance of X and Z and the covariance of Y and Z. Violation of this relationship can then be used to detect causality.

      The authors benchmark their approach experimentally in several synthetic circuits. In 4 positive control circuits, X is a TetR-YFP fusion protein that represses Z, which is an RFP reporter. The proposed approach detected the repression interaction in 2 of the 4 positive control circuits. The authors constructed 16 negative control circuit designs in which X was again TetR-YFP, but where Z was either a constitutively expressed reporter, or simply the cellular growth rate. The proposed method detected a causal effect in two of the 16 negative controls, which the authors argue is perhaps not a false positive, but due to an unexpected causal effect. Overall, the data support the potential value of the proposed approach.

      Strengths:

      The idea of a "no-causality control" in the context of detecting directed gene interactions is a valuable conceptual advance that could potentially see use in a variety of settings where perturbation-based causality detection experiments are made difficult by practical considerations.

      By proving their mathematical result in the context of a continuous-time Markov chain, the authors use a more realistic model of the cell than, for instance, a set of deterministic ordinary differential equations.

      The authors have improved the clarity and completeness of their proof compared to a previous version of the manuscript.

      Limitations:

      The authors themselves clearly outline the primary limitations of the study: The experimental benchmark is a proof of principle, and limited to synthetic circuits involving a handful of genes expressed on plasmids in E. coli. As acknowledged in the Discussion, negative controls were chosen based on the absence of known interactions, rather than perturbation experiments. Further work is needed to establish that this technique applies to other organisms and to biological networks involving a wider variety of genes and cellular functions. It seems to me that this paper's objective is not to delineate the technique's practical domain of validity, but rather to motivate this future work, and I think it succeeds in that.

      Might your new "Proposed additional tests" subsection be better housed under Discussion rather than Results?

      I may have missed this, but it doesn't look like you ran simulation benchmarks of your bootstrap-based test for checking whether the normalized covariances are equal. It would be useful to see in simulations how the true and false positive rates of that test vary with the usual suspects like sample size and noise strengths.

      It looks like you estimated the uncertainty for eta_xz and eta_yz separately. Can you get the joint distribution? If you can do that, my intuition is you might be able to improve the power of the test (and maybe detect positive control #3?). For instance, if you can get your bootstraps for eta_xz and eta_yz together, could you just use a paired t-test to check for equality of means?
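
      One possible way to obtain the joint distribution mentioned above is to resample whole cells, so that each bootstrap replicate yields eta_xz and eta_yz from the same resampled cells. The Python sketch below is only an illustration under stated assumptions: it assumes the normalized covariance is Cov(a, b) / (<a><b>), it assumes x, y and z are per-cell measurements of equal length with positive means, and it swaps in a percentile interval on the paired difference in place of the paired t-test suggested above. The function names and defaults are hypothetical and are not taken from the authors' code.

```python
import random
import statistics


def normalized_cov(a, b):
    """Return Cov(a, b) / (mean(a) * mean(b)); assumed definition of eta."""
    mean_a = statistics.fmean(a)
    mean_b = statistics.fmean(b)
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / len(a)
    return cov / (mean_a * mean_b)


def paired_bootstrap_diff(x, y, z, n_boot=2000, alpha=0.05, seed=0):
    """Percentile interval for eta_xz - eta_yz from jointly resampled cells."""
    rng = random.Random(seed)
    n = len(x)
    diffs = []
    for _ in range(n_boot):
        # Resample cell indices once, then reuse them for x, y and z so that
        # both normalized covariances come from the same bootstrap sample.
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        zs = [z[i] for i in idx]
        diffs.append(normalized_cov(xs, zs) - normalized_cov(ys, zs))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

      If the returned interval excludes zero, the two normalized covariances differ at the chosen level under these assumptions. Because the difference is computed within each replicate, shared sampling noise in z cancels, which is where any gain in power would come from.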

      The proof is a lot better, and it's great that you nailed down the requirement on the decay of beta, but the proof is still confusing in some places:

      On pg 29, it says "That is, dividing the right equation in Eq. 5.8 with alpha, we write the ..." but the next equation doesn't obviously have anything to do with Eq. 5.8, and instead (I think) it comes from Eq 5.5. This could be clarified.

      Later on page 29, you write "We now evoke the requirement that the averages xt and yt are stationary", but then you just repeat Eq. 5.11 and set it to zero. Clearly you needed the limit condition to set Eq. 5.11 to zero, but it's not clear what you're using stationarity for. I mean, if you needed stationarity for 5.11 presumably you would have referenced it at that step.

      It could be helpful for readers if you could spell out the practical implications of the theorem's assumptions (other than the no-causality requirement) by discussing examples of setups where it would or wouldn't hold.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      We have made the following small adjustments and resubmit the manuscript to be published as a Version of Record with eLife.

      Changes in main text of the manuscript:

      We have moved the “Proposed additional tests” subsection to the Discussion section as suggested by the referee. 

      We have added a link to a GitHub repository and a link to a Zenodo data repository at the beginning of the Materials and Methods section in the “Data and materials availability” subsection. The GitHub repository contains simulation code and data, and single-cell data analysis code. The Zenodo link contains our experimental data (we await your confirmation before we publish it officially on Zenodo).

      Changes in the supplemental information files

      We have fixed the typo on page 29 of the SI in which Eq. (8) was referred to in a derivation. It should be Eq. (5) instead. We thank the referee for catching this mistake which has now been corrected.

      We have fixed a typo on page 29 of SI, in which the word “evoke” is now “invoke”.  

      We have clarified the derivation on page 29 of the SI. The referee is correct that the limit condition was used to set the right-hand side of Eq. (5.11) to zero.

    1. eLife Assessment

      This important study reports an advancement in the diagnosis of Animal African Trypanosomosis (AAT), which adapts a CRISPR-based diagnostic tool (SHERLOCK4AAT) to detect different trypanosome species responsible for AAT. The evidence supporting the conclusions is convincing and in line with the current state-of-the-art diagnostics. This study will be of interest to the fields of Epidemiology, Public Health, and Veterinary Medicine.

    2. Reviewer #1 (Public review):

      Summary:

      The authors developed SHERLOCK4AAT, a CRISPR-Cas13a-based diagnostic toolbox for detecting multiple trypanosome species responsible for animal African trypanosomiasis. They created species-specific assays targeting six prevalent parasite species and validated the system using dried blood spots from domestic pigs in Guinea and Côte d'Ivoire. Field testing revealed high infection rates (62.7% of pigs infected) and, notably, the presence of human-infective parasites in domestic animals.

      Major Strengths:

      This study represents a valuable application of CRISPR-based detection technology to veterinary diagnostics, with strong potential for practical implementation. The authors conducted comprehensive validation, including statistical analyses to determine sensitivity and specificity, and demonstrated field utility through large-scale testing of 424 samples from two geographically distinct regions. The detection of human-infective parasites in pigs at both sites provides important One Health insights supporting integrated disease surveillance and has direct implications for public health policy and disease elimination programs. The methodology is robust, incorporating Bayesian statistical modeling and offering clear practical advantages such as dried blood spot compatibility and detection of active infections. The revised manuscript also addresses implementation considerations, including cost, training needs, and field logistics.

      Major Weaknesses:

      Some technical limitations constrain broader applicability. The assay for one key parasite species (T. vivax) shows suboptimal sensitivity, which may limit its utility in detecting this important pathogen. The current assay design does not distinguish between closely related species within the same subgenus, an important factor for certain epidemiological studies. Additionally, some assays relied on synthetic controls due to unavailable biological material, and the discussion of potential cross-reactivity with related kinetoplastid parasites is limited.

      Achievement of Aims:

      The authors clearly achieved their primary objectives of developing a sensitive, species-specific diagnostic system and demonstrating its applicability in real-world settings. The detection of human-infective trypanosomes in domestic pigs provides valuable epidemiological evidence in support of One Health strategies and targeted disease elimination efforts.

      Impact and Utility:

      This work responds to a well-documented need in veterinary diagnostics, where current methods often lack sensitivity or species discrimination. The system offers practical benefits for resource-limited settings through a short assay duration and compatibility with dried blood spot samples. While certain performance limitations may restrict broader adoption, the species identification capability represents a substantial advancement over existing approaches. The findings enhance our understanding of parasite diversity in livestock and their potential role as zoonotic reservoirs, with implications extending beyond veterinary medicine to public health surveillance and policy development.

      Context:

      This study makes a timely and relevant contribution to diagnostic epidemiology and One Health surveillance frameworks. The field-adapted use of advanced molecular detection technologies represents a significant step toward improved disease monitoring in regions where trypanosomiasis poses ongoing threats to animal health, agriculture, and human livelihoods. The cross-disciplinary implications for veterinary medicine, public health, and disease elimination programs underscore the broader significance of this work.

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript is fundamental due to the significance of its findings. The strength of the evidence is compelling, and the manuscript is publishable since the corrections have been made.

      Strengths:

      Using a Novel SHERLOCK4AAT toolkit for diagnosis.

      Identification of various sub-species of Trypanosomes.

      Differentiating the animal sub-species from the human one.

      Corrections Made:

      Definite articles have been removed from the title.

      The words of the title have been reduced to 15.

      Typographical errors have been corrected.

      Weaknesses:

      None

    4. Reviewer #3 (Public review):

      Summary:

      The study adapts a CRISPR-based detection toolkit (SHERLOCK assay) using conserved and species-specific targets for the detection of some members of the Trypanosomatidae family of veterinary importance and species-specific assays to differentiate between the six most common animal trypanosome species responsible for AAT (SHERLOCK4AAT). The assays were able to discriminate between Trypanozoon (T. b. brucei, T. evansi and T. equiperdum), T. congolense (Savannah, Forest, Kilifi and Dzanga sangha), T. vivax, T. theileri, T. simiae and T. suis. The design of both broad and species-specific assays was based primarily on sequences of the 18S rRNA, GAPDH (glyceraldehyde-3-phosphate dehydrogenase) and invariant flagellum antigen (IFX) genes for species identification. Most importantly, the authors showed varying limits of detection for the different SHERLOCK assays, which are somewhat comparable to those of PCR-derived molecular techniques currently used for detecting animal trypanosomes, even though some of these methodologies have used other primers that target genes such as ITS1 and 7SL sRNA.

      The data presented in the study are particularly useful and of significant interest for diagnosis of AAT in affected areas.

      Strengths:

      The assays convincingly allow for the analysis and detection of most trypanosomes in AAT

      Weaknesses:

      The assay cannot distinguish T. b. brucei, T. evansi and T. equiperdum using the 18S rRNA gene, and the IFX gene assay does not achieve the sensitivity required for detection of T. vivax. T. brucei brucei and T. vivax are among the most predominant infective species in animals (in addition to T. congolense); a reliable assay should therefore detect them convincingly to allow proper use of the diagnostic assay.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review): 

      Summary: 

      This study addresses a critical gap in veterinary diagnostics by developing a CRISPR-based diagnostic toolbox (SHERLOCK4AAT) for detecting animal African trypanosomosis. It describes the development and field deployment of SHERLOCK4AAT, a CRISPR-Cas13-based diagnostic toolbox for the eco-epidemiological surveillance of animal African trypanosomosis (AAT) in West Africa. The authors successfully created and validated species-specific assays for multiple trypanosomes, including T. congolense, T. vivax, T. theileri, T. simiae, and T. suis, alongside pan-trypanosomatid and pan-Trypanozoon assays. The field validation in pigs from Guinea and Côte d'Ivoire revealed high trypanosome prevalence (62.7%), frequent co-infections, and importantly identified T. b. gambiense in one animal at each site, suggesting pigs may serve as potential reservoirs for this human-infective parasite.

      A major strength of the study lies in its methodological innovation. By adapting SHERLOCK to target both conserved and species-discriminating sequences, the authors achieved high sensitivity and specificity in detecting Trypanosoma species. Their use of dried blood spots, validated thresholds through ROC analyses, and statistical robustness (e.g., Bayesian latent class modeling) provides a strong foundation for their conclusions.

      The results are significant: over 60% of pigs tested positive for at least one trypanosome species, with co-infections observed frequently and T. b. gambiense detected in pigs at both sites. These findings have direct implications for the role of animal reservoirs in human disease transmission and underscore the value of pigs as sentinel hosts in gHAT elimination efforts.

      The limitations are well acknowledged, particularly the suboptimal sensitivity of the T. vivax assay and the reliance on synthetic controls for T. suis and T. simiae. However, these limitations do not undermine the overall conclusions, and the paper provides a clear roadmap for further assay refinement and implementation.

      This study offers a timely, impactful, and well-substantiated contribution to the field. The SHERLOCK4AAT toolbox holds promise for improving AAT diagnostics in resource-limited settings and advancing One Health surveillance frameworks.

      Thank you

      Strengths: 

      (1) The adaptation of SHERLOCK technology for AAT represents a significant technical advancement, offering higher sensitivity than traditional parasitological methods and the ability to detect multiple species simultaneously.

      (2) Rigorously performed with validation using appropriate controls, ROC curve analyses, and Bayesian latent class modelling, establishing clear analytical sensitivity and specificity for most assays.

      (3) Testing 424 pig samples across two countries provides robust evidence of the tool's utility and reveals important epidemiological insights about trypanosome diversity and prevalence.

      (4) The identification of T. b. gambiense in pigs at both sites has significant implications for HAT elimination strategies and highlights the need for integrated One Health approaches.

      (5) The use of dried blood spots and RNA detection for active infections makes the approach practical for field surveillance in resource-limited settings.

      Thank you

      Weaknesses: 

      (1) The manuscript would benefit from more detailed discussion of practical considerations such as cost, equipment requirements, and training needs for implementing SHERLOCK in endemic areas and rural settings which would improve applicability.

      This is now addressed in the revised discussion (end of the first section).

      (2) Limited discussion of pig selection criteria: More justification for choosing pigs as sentinel animals and discussion of potential limitations of this approach would strengthen the manuscript.

      Yes, this is now more clearly explained in the revised discussion (beginning of the first section).

      (3) More details on why certain genes were targeted would strengthen the methods.

      The first result section ‘Selection of targets for broad and species-specific SHERLOCK assays targeting AAT species (SHERLOCK4AAT)’ is already dedicated to extensively explaining target selection, hence we’re afraid we don’t know what could be added.  

      (4) Table formatting could be improved for readability. 

      (5) Some figures are complex and would benefit from additional explanations in the legends.

      We have tried to improve these two aspects as much as possible in the revised manuscript.

      Reviewer #2 (Public review): 

      Summary: 

      The manuscript is important due to the significance of the findings. The strength of evidence is convincing.

      Thank you

      Strengths: 

      (1) Using a Novel SHERLOCK4AAT toolkit for diagnosis. 

      (2) Identification of various sub-species of Trypanosomes. 

      (3) Differentiating the animal subspecies from the human one. 

      Thank you

      Weaknesses: 

      (1) The title is too long, and the use of definite articles should be reduced in the title.

      The title has been improved in the revised version.

      (2) The route of blood sample collection in the animals should be well defined and explained.

      This has been more clearly explained in the revised method section.

      Reviewer #3 (Public review):

      Summary: 

      The study adapts a CRISPR-based detection toolkit (SHERLOCK assay) using conserved and species-specific targets for the detection of some members of the Trypanosomatidae family of veterinary importance and species-specific assays to differentiate between the six most common animal trypanosome species responsible for AAT (SHERLOCK4AAT). The assays were able to discriminate between Trypanozoon (T. b. brucei, T. evansi, and T. equiperdum), T. congolense (Savannah, Forest, Kilifi, and Dzanga sangha), T. vivax, T. theileri, T. simiae, and T. suis. The design of both broad and species-specific assays was based primarily on sequences of the 18S rRNA, GAPDH (glyceraldehyde-3-phosphate dehydrogenase), and invariant flagellum antigen (IFX) genes for species identification. Most importantly, the authors showed varying limits of detection for the different SHERLOCK assays, which are somewhat comparable to those of PCR-derived molecular techniques currently used for detecting animal trypanosomes, even though some of these methodologies have used other primers that target genes such as ITS1 and 7SL sRNA.

      The data presented in the study are particularly useful and of significant interest for the diagnosis of AAT in affected areas.

      Thank you

      Strengths: 

      The assays convincingly allow for the analysis and detection of most trypanosomes in AAT.

      Thank you

      Weaknesses: 

      The assay cannot distinguish T. b. brucei, T. evansi, and T. equiperdum using the 18S rRNA gene, and the IFX gene assay does not achieve the sensitivity required for detection of T. vivax. T. brucei brucei and T. vivax are among the most predominant infective species in animals (in addition to T. congolense); a reliable assay should therefore detect them convincingly to allow proper use of the diagnostic assay.

      We agree with this point and aim to improve the toolbox for future studies.

      Reviewer #1 (Recommendations for the authors):

      (1) Provide additional details on the practicality of SHERLOCK deployment in the field, including training, costs, and infrastructure (potential challenges for field deployment, including suggestions for how to overcome these barriers).

      This is now addressed in the revised discussion (end of the first section).

      (2) Provide more detailed justification for choosing pigs as the main study species and discuss potential benefits and limitations of extending the approach to other livestock species.

      Yes, this is now more clearly explained in the revised discussion (beginning of the first section).

      (3) Add a comparison table comparing SHERLOCK4AAT performance metrics (sensitivity, specificity, LoD) with existing molecular diagnostic methods for AAT for ease of reference.

      There are dozens of different serological, immunological and molecular approaches with highly variable levels of sensitivity and specificity, already reviewed and compared in detail in two references from 2022 (Desquesnes et al. a and b), which we have cited, as well as in a newly added reference (Ebhodaghe F, Acta Trop, 2018). Hence, we decided to refer only to the most comparable studies in the present article.

      (4) Review complex figures and improve legends for better readability and interpretation.

      We have tried to improve this as much as possible in the revised manuscript.

      Reviewer #2 (Recommendations for the authors): 

      (1) Reduce the number of words in the title from 28 to not more than 20.

      The title has been improved in the revised version.

      (2) Specify the particular route of collection of blood samples in the various animals.

      Yes, this is now more clearly explained in the revised method section.

      (3) Correct all typographical errors. 

      We have tried to improve this as much as possible in the revised manuscript.

      Thanks. I wish you the best in your publication process. 

      Thank you

      Reviewer #3 (Recommendations for the authors): 

      Minor comments 

      (1) The authors can expand the discussion to include other recent diagnostic assays for Animal trypanosomiasis, such as those that target other genes like tubulin.

      Please see response to Reviewer 1 point #3 above.

      (2) The cost-effectiveness of the use of the assay can be discussed since the assay is expected to be used for work in some resource-deprived areas. For example, will it cost a researcher less to do a diagnosis with this assay relative to what is already available?

      This is now addressed in the revised discussion (end of the first section).

      (3) Is Côte d'Ivoire more endemic for AAT than Guinea? Will this account for the apparently consistent differences in the percentage of positive samples, or is this just because of the type of samples used from the two locations?

      As the sampling method, sample preservation and sample analysis were the same for both groups, it does appear that pigs, at least domesticated ones, in the study region of Côte d'Ivoire were more frequently infected than those in the study region of Guinea. It is, however, risky to extrapolate these observations to the AAT prevalence in the entire countries and/or to other mammals.

      (4) Can the authors comment on how long one can store the samples for an effective and reliable assay?

      The samples can be stored for several months at ambient temperature in a sealed bag with silica gel packages to reduce humidity. We have added this detail in the revised methods section.

      (5) It is not clear whether the authors used conventional molecular diagnostics to compare the data obtained from this particular cohort of animals as reference is made to published data. It is not surprising that the SHERLOCK performed better than using parasitology-based methodology.

      This is now addressed in the revised discussion.

      (6) (Figure 4D-5D) should be 4D and 5D.

      Thank you, this has been corrected.

    1. eLife Assessment

      This useful study integrates experimental methods from materials science with psychophysical methods to investigate how frictional stabilities influence tactile surface discrimination. The authors argue that force fluctuations arising from transitions between frictional sliding conditions facilitate the discrimination of surfaces with similar friction coefficients. However, the reliance on friction data from an artificial finger, combined with correlational analyses that fall short of establishing a mechanistic link to perception, renders the findings incomplete.

    2. Reviewer #2 (Public review):

      This is a revised version of a paper I reviewed previously.

      Again, the purpose of the paper is to suggest that common metrics, such as friction or any given physical property of the surface, are probably inadequate to predict the perception of the surface or its discriminability. Instead, the authors propose a very interesting and original idea that, instead, frictional instabilities are related to fine touch perception (title).

      Overall, the authors have put much effort into improving the manuscript, enhancing clarity, and avoiding overstatements. And I feel the narrative is indeed much improved and less ambiguous.

      However, the authors have systematically avoided addressing the main comment of all reviewers: the link made between the mock finger passive experiment and the active human psychophysics is incorrect and should not be done, because its interpretation could be flawed.

      - First, this link is very weak (the correlation of 6 datapoints is barely significant).
      - Second, the real and mock fingers have very different properties (think about moisture, compliance, roughness, ...).
      - Third, the comparison is made between a passive and well-controlled experiment and an active exploration. Yet, the comparison metrics (number of events) are clearly dependent on exploration procedures.
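
      To give a sense of scale for the six-datapoint correlation mentioned above, the following minimal Python sketch computes the smallest Pearson correlation that reaches two-sided significance at the 5% level for a given number of points. It assumes a standard t-test on the correlation coefficient, which may not be the test the authors used; the function names and the hard-coded table value are illustrative only.

```python
import math

# Two-sided 5% critical value of Student's t with 4 degrees of freedom
# (standard table value; hard-coded to keep the sketch dependency-free).
T_CRIT_DF4 = 2.776


def critical_r(n, t_crit):
    """Smallest |r| that reaches significance for n paired points."""
    df = n - 2
    return t_crit / math.sqrt(t_crit ** 2 + df)


def t_statistic(r, n):
    """t statistic for a Pearson correlation r computed from n points."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))
```

      Under these assumptions, critical_r(6, T_CRIT_DF4) is roughly 0.81, so a correlation over six points has to be very strong before it counts as significant.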

      In your response to my comments:

      "We have made changes throughout the manuscript to acknowledge that our findings are correlative, clarifying this throughout, and incorporating into the discussion how our work may enable biomechanical measurements and tactile decision making models"

      The authors admit that the analysis is flawed, yet they did not remove it. If they cannot demonstrate that the mock finger and the human finger behave the same way during the perceptual experiment, then they should remove Fig2 that combines apples and oranges. OR, they should look at the active exploration data and compute the same metrics on that data.

      "This "weird choice" is the central innovation of this paper. This choice was necessary because we demonstrated that the common usage of friction coefficient is fundamentally flawed: we see that friction coefficient suggests that surface which are more different would feel more similar - indeed the most distinctive surfaces would be two surfaces that are identical, which is clearly spurious. "

      They did not "demonstrate" such a flaw. Again, the difference in friction is between the mock finger trials. At the very least, the authors should verify that it is true of the active human experiment.

      "To fully implement this, a decision-making model is necessary because, as a counter example, a participant could have generated 10 swipes of SFW and 1 swipe of a Sp, but the Sp may have been the most important event for making a tactile decision. This type of scenario is not compatible with the analysis suggested - and similar counterpoints can be made for other types of seemingly straightforward analysis."

      The suggested analyses are straightforward and would be much more valuable than the data from the mock finger, even with the potential variability stated above.

      "We recognize that, with all factors being equal, this sample size is on the smaller end"

      Yet, the authors did not collect additional data to confirm their findings.

    3. Reviewer #3 (Public review):

      Strengths:

      The paper describes a new perspective on friction perception, with the hypothesis that humans are sensitive to the instabilities of the surface rather than the coefficient of friction. The paper is very well written and with a comprehensive literature survey.

      One of the central tools used by the author to characterize the frictional behavior is the frictional instabilities maps. With these maps, it becomes clear that two different surfaces can have both similar and different behavior depending on the normal force and the speed of exploration. It puts forward that friction is a complicated phenomenon, especially for soft

      The psychophysics study is centered around an odd-one-out protocol, which has the advantage of avoiding any external reference to what would mean friction or texture for example. The comparisons are made only based on the texture being similar or not.

      The results show a significant relationship between the distance between frictional maps and the success rate in discriminating two kinds of surface.

      Weaknesses:

      The main weakness of the paper comes from the fact that the frictional maps and the extensive psychophysics study are not made at the same time, nor with the same finger. The frictional maps are produced with an artificial finger made out of PDMS which is a poor substitute for the complex tribological properties of skin.

      The evidence would have been much stronger if the measurement of the interaction was done during the psychophysical experiment. In addition, because of the protocol, the correlation is based on aggregates rather than on individual interactions. However the current data already bring new light on the nature of frictional oscillation and their link to perception.

      The authors compensate with a third experiment where they used a 2AFC protocol and an online force measurement. But the results of this third study fail to solidify the relation.

      No map of the real finger interaction is shown, bringing doubt to the validity of the frictional map for something as variable as human fingers.

    4. Reviewer #4 (Public review):

      Summary:

      In this paper, Derkaloustian et al. look at the important topic of what affects fine touch perception. The observations that there may be some level of correlation with instabilities are intriguing. They attempted to characterize different materials by counting the frequency (occurrence #, not of vibration) of instabilities at various speeds and forces of a PDMS slab pulled lengthwise over the material. They then had humans perform the same vertical motion to discriminate between these samples. They correlated the % correct in discrimination with differences in frequency of steady sliding over the design space as well as other traditional parameters such as friction coefficient and roughness.

      The authors pose an interesting hypothesis and make an interesting observation about the occurrences of instability regimes in different materials while in contact with PDMS, which is valuable for the community to see in publication. It should be noted, however, that the finger is complex, and there are many factors that may be oversimplified, and perhaps even incorrect, with the use of the PDMS finger. There are trends, such as the trend of surfaces that are more similar in PDMS friction coefficient being easier to discriminate than those with more different PDMS friction coefficients, that contradict multiple other papers in the literature (Fehlberg et al., 2024; Smith and Scott, 1996). This may be due to the PDMS finger not being representative of the real finger conditions. A measurement of friction and the instabilities with a human finger, or demonstration that the PDMS finger is producing the same results (friction coefficient, instabilities, etc.) as a human finger, is needed.

      Strengths:

      The strength of this paper is in its intriguing hypothesis and important observation that instabilities may contribute to what humans are detecting as differences in these apparently similar samples.

      Weaknesses:

      There are significant weaknesses in the representativeness of the PDMS finger, the vertical motion, and the speed of sliding to real human exploration. The real finger has multiple layers with different moduli. In fact, the stratum corneum cells, which are the outer layer at the interface and determine the friction, have a much higher modulus than PDMS. In addition, the flat contact area can cause shifting of contact points. Both can contribute to making the PDMS finger have much more stick slip than a real finger. In fact, if you look at the regime maps, there is very little space that has steady sliding. This does not represent human exploration of surfaces well. We do not tend to use force and velocity that will cause extensive stick slip (frequent regions of 100% stick slip) and, in fact, the speeds used in the study are on the slow side, which also contributes to more stick slip. At higher speeds and lower forces, all of the materials had steady sliding regions. Further, on these very smooth surfaces, the friction and stiction are more complex, and one cannot dismiss considerations such as finger material property change with sweat pore occlusion and sweat capillary forces. Also, the vertical motion of both the PDMS finger and the instructed human subjects is not the motion that humans typically use to discriminate between surfaces.

      This all leads to the critical question: why are the friction, normal force, and velocity not measured during the human exploration using the real human finger? An alternative would be showing that the PDMS finger reproduces the results of the human finger. I have checked the authors' previous papers with this setup and did not find one that showed that the PDMS finger produced the same results as a human finger (Carpenter et al., 2018; Dhong et al., 2018; Nolin et al., 2022, 2021). The reviewer is not asking to do a more detailed psychophysical study with a decision-making model. All that is being asked is to use a human finger for the friction coefficient and instability measurements at typical human forces and speeds, or at least to do these measurements with both for one or two samples to show that the PDMS finger produces the same results as a human finger. The authors posed an extremely interesting hypothesis that humans may alter their speed to feel the instability transition regions. This is something that could be measured with a real finger but is not likely to be correlated accurately enough to match regime boundaries determined with such a simplified artificial finger.

      References

      Carpenter CW, Dhong C, Root NB, Rodriquez D, Abdo EE, Skelil K, Alkhadra MA, Ramírez J, Ramachandran VS, Lipomi DJ. 2018. Human ability to discriminate surface chemistry by touch. Mater Horiz 5:70-77. doi:10.1039/C7MH00800G

      Dhong C, Kayser LV, Arroyo R, Shin A, Finn M, Kleinschmidt AT, Lipomi DJ. 2018. Role of fingerprint-inspired relief structures in elastomeric slabs for detecting frictional differences arising from surface monolayers. Soft Matter 14:7483-7491. doi:10.1039/C8SM01233D

      Fehlberg M, Monfort E, Saikumar S, Drewing K, Bennewitz R. 2024. Perceptual Constancy in the Speed Dependence of Friction During Active Tactile Exploration. IEEE Transactions on Haptics 17:957-963. doi:10.1109/TOH.2024.3493421

      Nolin A, Licht A, Pierson K, Lo C-Y, Kayser LV, Dhong C. 2021. Predicting human touch sensitivity to single atom substitutions in surface monolayers for molecular control in tactile interfaces. Soft Matter 17:5050-5060. doi:10.1039/D1SM00451D

      Nolin A, Pierson K, Hlibok R, Lo C-Y, Kayser LV, Dhong C. 2022. Controlling fine touch sensations with polymer tacticity and crystallinity. Soft Matter 18:3928-3940. doi:10.1039/D2SM00264G

      Smith AM, Scott SH. 1996. Subjective scaling of smooth surface friction. Journal of Neurophysiology 75:1957-1962. doi:10.1152/jn.1996.75.5.1957

    1. eLife Assessment

      This valuable simulation study proposes a new coarse-grained model to explain the effects of CpG methylation on nucleosome wrapping energy and nucleosome positioning. The evidence to support the claims in the paper looks solid and this work will be of interest to the researchers working on gene regulation and mechanisms of DNA methylation.

    2. Reviewer #1 (Public review):

      Summary:

      In this manuscript, the authors used a coarse-grained DNA model (cgNA+) to explore how DNA sequences and CpG methylation/hydroxymethylation influence nucleosome wrapping energy and the probability density of optimal nucleosomal configuration. Their findings indicate that both methylated and hydroxymethylated cytosines lead to increased nucleosome wrapping energy. Additionally, the study demonstrates that methylation of CpG islands increases the probability of nucleosome formation.

      Strengths:

      The major strength of this method is that the model explicitly includes phosphate groups as DNA-histone binding site constraints, enhancing CG model accuracy and computational efficiency and allowing comprehensive calculations of DNA mechanical properties and deformation energies.

      Weaknesses:

      A significant limitation of this study is that the parameter sets for the methylated and hydroxymethylated CpG steps in the cgNA+ model are derived from all-atom molecular dynamics (MD) simulations that use previously established force field parameters for modified cytosines (Pérez A, et al. Biophys J. 2012; Battistini, et al. PLOS Comput Biol. 2021). These parameters suggest that both methylated and hydroxymethylated cytosines increase DNA stiffness and nucleosome wrapping energy, which could predispose the coarse-grained model to replicate these findings. Notably, conflicting results from other all-atom MD simulations, such as those by Ngo T in Nat. Commun. 2016, show that hydroxymethylated cytosines increase DNA flexibility, contrary to methylated cytosines. If the cgNA+ model were trained on these latter parameters or other all-atom MD force fields, different conclusions might be obtained regarding the effects of methylation and hydroxymethylation on nucleosome formation.

      Despite the training parameters of the cgNA+ model, the results presented in the manuscript indicate that methylated cytosines increase both DNA stiffness and nucleosome wrapping energy. However, when comparing nucleosome occupancy scores with predicted nucleosome wrapping energies and optimal configurations, the authors find that methylated CGIs exhibit higher nucleosome occupancies than unmethylated ones, which seems to contradict the expected relationship where increased stiffness should reduce nucleosome formation affinity. In the manuscript, the authors also admit that these conclusions "apparently runs counter to the (perhaps naive) intuition that high nucleosome forming affinity should arise for fragments with low wrapping energy". Previous all-atom MD simulations (Pérez A, et al. Biophys J. 2012; Battistini, et al. PLOS Comput Biol. 2021; Ngo T, et al. Nat. Commun. 2016) show that the stiffer DNA upon CpG methylation reduces the affinity of DNA to assemble into nucleosomes or destabilizes nucleosomes. Given these findings, the authors need to address and reconcile these seemingly contradictory results, as the influence of epigenetic modifications on DNA mechanical properties and nucleosome formation is a critical aspect of their study.

      Understanding the influence of sequence-dependent and epigenetic modifications of DNA on mechanical properties and nucleosome formation is crucial for comprehending various cellular processes. The authors' study, focusing on these aspects, definitely will garner interest from the DNA methylation research community.

      Comments on revised version:

      The authors have addressed most of my comments and concerns regarding this manuscript.

    3. Reviewer #2 (Public review):

      Summary:

      This study uses a coarse-grained model for double stranded DNA, cgNA+, to assess nucleosome sequence affinity. cgNA+ coarse-grains DNA on the level of bases and accounts also explicitly for the positions of the backbone phosphates. It has been proven to reproduce all-atom MD data very accurately. It is also ideally suited to be incorporated into a nucleosome model because it is known that DNA is bound to the protein core of the nucleosome via the phosphates.

      It is still unclear whether this harmonic model parametrized for unbound DNA is accurate enough to describe DNA inside the nucleosome. Previous models by other authors, using more coarse-grained models of DNA, have been rather successful in predicting base pair sequence dependent nucleosome behavior. This is at least the case as far as DNA shape is concerned, whereas assessing the role of DNA bendability (something this paper focuses on) has been consistently challenging in all nucleosome models to my knowledge.

      It is thus of major interest whether this more sophisticated model is also more successful in handling this issue. As far as I can tell the work is technically sound and properly accounts for not only the energy required in wrapping DNA but also entropic effects, namely the change in entropy that DNA experiences when going from the free state to the bound state. The authors make an approximation here which seems to me to be a reasonable first step.

      Of interest is also that the authors have the parameters at hand to study the effect of methylation of CpG-steps. This is especially interesting as it allows one to study a scenario where changes in the physical properties of base pair steps via methylation might influence nucleosome positioning and stability in a cell-type specific way.

      Overall, this is an important contribution to the questions of how sequence affects nucleosome positioning and affinity. The findings suggest that cgNA+ has something new to offer. But the problem is complex, also on the experimental side, so many questions remain open. Despite this, I highly recommend publication of this manuscript.

      Strengths:

      The authors use their state-of-the-art coarse grained DNA model which seems ideally suited to be applied to nucleosomes as it accounts explicitly for the backbone phosphates.

      Weaknesses:

      The authors introduce penalty coefficients c_i to avoid steric clashes between the two DNA turns in the nucleosome. This requires c_i-values that are so high that standard deviations in the fluctuations of the simulation are smaller than in the experiments.

    4. Reviewer #3 (Public review):

      Summary:

      In this study, the authors utilize biophysical modeling to investigate differences in free energies and nucleosomal configuration probability density of CpG islands and nonmethylated regions in the genome. Toward this goal, they develop and apply the cgNA+ coarse-grained model, an extension of their prior molecular modeling framework.

      Strengths:

      The study utilizes biophysical modeling to gain mechanistic insight into nucleosomal occupancy differences in CpG and nonmethylated regions in the genome.

      Weaknesses:

      Although the overall study is interesting, the manuscript needs more clarity in places. Moreover, the rationale and conclusion for some of the analyses are not well described.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, the authors used a coarse-grained DNA model (cgNA+) to explore how DNA sequences and CpG methylation/hydroxymethylation influence nucleosome wrapping energy and the probability density of optimal nucleosomal configuration. Their findings indicate that both methylated and hydroxymethylated cytosines lead to increased nucleosome wrapping energy. Additionally, the study demonstrates that methylation of CpG islands increases the probability of nucleosome formation.

      Strengths:

      The major strength of this method is that the model explicitly includes elastic constraints on the positions of phosphate groups facing a histone octamer, as DNA-histone binding site constraints. The authors claim that their model enhances the accuracy and computational efficiency and allows comprehensive calculations of DNA mechanical properties and deformation energies.

      Weaknesses:

      A significant limitation of this study is that the parameter sets for the methylated and hydroxymethylated CpG steps in the cgNA+ model are derived from all-atom molecular dynamics (MD) simulations that suggest that both methylated and hydroxymethylated cytosines increase DNA stiffness and nucleosome wrapping energy (Pérez A, et al. Biophys J. 2012; Battistini, et al. PLOS Comput Biol. 2021). It could predispose the coarse-grained model to replicate these findings. Notably, conflicting results from other all-atom MD simulations, such as those by Ngo T in Nat. Commun. 2016, show that hydroxymethylated cytosines increase DNA flexibility, contrary to methylated cytosines. If the cgNA+ model were trained on these latter parameters or other all-atom force fields, different conclusions might be obtained regarding the effects of methylation and hydroxymethylation on nucleosome formation.

      Despite the training parameters of the cgNA+ model, the results presented in the manuscript indicate that methylated cytosines increase both DNA stiffness and nucleosome wrapping energy. However, when comparing nucleosome occupancy scores with predicted nucleosome wrapping energies and optimal configurations, the authors find that methylated CGIs exhibit higher nucleosome occupancies than unmethylated ones, which seems to contradict their findings from the same paper which showed that increased stiffness should reduce nucleosome formation affinity. In the manuscript, the authors also admit that these conclusions “apparently runs counter to the (perhaps naive) intuition that high nucleosome forming affinity should arise for fragments with low wrapping energy”. Previous all-atom MD simulations (Pérez A, et al. Biophys J. 2012; Battistini, et al. PLOS Comput Biol. 2021; Ngo T, et al. Nat. Commun. 2016) show that the stiffer DNA upon CpG methylation reduces the affinity of DNA to assemble into nucleosomes or destabilizes nucleosomes. Given these findings, the authors need to address and reconcile these seemingly contradictory results, as the influence of epigenetic modifications on DNA mechanical properties and nucleosome formation is a critical aspect of their study. Understanding the influence of sequence-dependent and epigenetic modifications of DNA on mechanical properties and nucleosome formation is crucial for comprehending various cellular processes. The authors’ study, focusing on these aspects, will definitely garner interest from the DNA methylation research community.

      Training the cgNA+ model on alternative MD simulation datasets is certainly of interest to us. However, due to the significant computational cost, this remains a goal for future work. The relationship between nucleosome occupancy scores and nucleosome wrapping energy is still debated, with conflicting findings reported in the literature, as noted in our Discussion section. Interestingly, we find that our predicted log probability density of DNA spontaneously acquiring a nucleosomal configuration is a better indicator of nucleosome occupancy than our predicted DNA nucleosome wrapping energy.

      Reviewer #2 (Public Review):

      Summary:

      This study uses a coarse-grained model for double-stranded DNA, cgNA+, to assess nucleosome sequence affinity. cgNA+ coarse-grains DNA on the level of bases and accounts also explicitly for the positions of the backbone phosphates. It has been proven to reproduce all-atom MD data very accurately. It is also ideally suited to be incorporated into a nucleosome model because it is known that DNA is bound to the protein core of the nucleosome via the phosphates.

      It is still unclear whether this harmonic model parametrized for unbound DNA is accurate in describing DNA inside the nucleosome. Previous models by other authors, using more coarse-grained models of DNA, have been rather successful in predicting base pair sequence-dependent nucleosome behavior. This is at least the case as far as DNA shape is concerned whereas assessing the role of DNA bendability (something this paper focuses on) has been consistently challenging in all nucleosome models, to my knowledge.

      It is thus of major interest whether this more sophisticated model is also more successful in handling this issue. As far as I can tell the work is technically sound and properly accounts for not only the energy required in wrapping DNA but also entropic effects, namely the change in entropy that DNA experiences when going from the free state to the bound state. The authors make an approximation here which seems to me to be a reasonable first step.

      Of interest is also that the authors have the parameters at hand to study the effect of methylation of CpG-steps. This is especially interesting as it allows us to study a scenario where changes in the physical properties of base pair steps via methylation might influence nucleosome positioning and stability in a cell-type-specific way.

      Overall, this is an important contribution to the question of how the sequence affects nucleosome positioning and affinity. The findings suggest that cgNA+ has something new to offer. But the problem is complex, also on the experimental side, so many questions remain open.

      Strengths:

      The authors use their state-of-the-art coarse-grained DNA model which seems ideally suited to be applied to nucleosomes as it accounts explicitly for the backbone phosphates.

      Weaknesses:

      (1) According to the abstract, the authors consider two “scalar measures of the sequence-dependent propensity of DNA to wrap into nucleosomes”. One is the bending energy and the other is the free energy. Specifically, in the latter, the authors take the difference between the free energies of the wrapped and the free DNA. Whereas the entropy of the latter can be calculated exactly, they assume that the bound DNA always has the same entropy (independent of sequence) in its more confined state. The problem is the way in which this is written (e.g. below Eq. 6), which is hard to understand. The authors should mention that the negative of Eq. 6 is what physicists call free energy, specifically the free energy difference between bound and free DNA.

      We have included the necessary clarifications in the revised manuscript, below Eq. 6.

      (2) In Eq. 5 the authors introduce penalty coefficients c<sub>i</sub>. They write that values are “set by numerical experiment to keep distances ... within the ranges observed in the PDB structure, while avoiding sterical clashes in DNA.” This is rather vague, especially since it is unclear to me what type of steric clashes might occur. Figure 1 then shows a comparison between crystal structures and simulated structures. They are reasonably similar, but standard deviations in the fluctuations of the simulation are smaller than in the experiments. Why did the authors not choose smaller c<sub>i</sub>-values to have a better fit? Do smaller values lead to unwanted large fluctuations that would lead to steric clashes between the two DNA turns? I also wonder what side views of the nucleosomes look like (experiments and simulations) and whether in this side view larger fluctuations of the phosphates can be observed in the simulation that would eventually lead to turn-turn clashes for smaller c<sub>i</sub>-values.

      The side view plots of the experimental and predicted nucleosome structures are now added to Supplementary material (Figure S8). Indeed, smaller c<sub>i</sub> values lead to steric clashes between the two turns of DNA – this is now specified in the Methods section. A possible improvement of our optimisation method and a direction of future work would be adding a penalty which prevents steric clashes to the objective function. Then the c<sub>i</sub> values could be reduced to have bigger fluctuations that are even closer to the experimental structures. We added this explanation to the Results section.
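
      The trade-off discussed in this exchange can be illustrated with a one-dimensional toy problem. The sketch below is not the authors' Eq. 5: it assumes a single coordinate x with harmonic stiffness k and rest value mu, a quadratic penalty c*(x - target)**2 pulling x toward a hypothetical target position, and energies in units of kT. All names and numbers are illustrative assumptions.

```python
import math

def penalized_minimum(mu, k, target, c):
    """Minimizer of 0.5*k*(x - mu)**2 + c*(x - target)**2."""
    return (k * mu + 2.0 * c * target) / (k + 2.0 * c)

def penalized_std(k, c):
    """Thermal standard deviation (kT = 1) around the minimum.

    The total curvature of the penalized energy is k + 2c.
    """
    return 1.0 / math.sqrt(k + 2.0 * c)

# Hypothetical values: rest position 0, target 1, unit stiffness.
for c in (0.5, 5.0, 50.0):
    x_star = penalized_minimum(mu=0.0, k=1.0, target=1.0, c=c)
    print(f"c={c:5.1f}  minimum={x_star:.3f}  std={penalized_std(1.0, c):.3f}")
```

      In this toy setting, a larger c moves the minimum closer to the target but narrows the fluctuations around it, which is the qualitative behavior the reviewer and authors describe.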

      Reviewer #3 (Public Review):

      Summary:

      In this study, the authors utilize biophysical modeling to investigate differences in free energies and nucleosomal configuration probability density of CpG islands and nonmethylated regions in the genome. Toward this goal, they develop and apply the cgNA+ coarse-grained model, an extension of their prior molecular modeling framework.

      Strengths:

      The study utilizes biophysical modeling to gain mechanistic insight into nucleosomal occupancy differences in CpG and nonmethylated regions in the genome.

      Weaknesses:

      Although the overall study is interesting, the manuscripts need more clarity in places. Moreover, the rationale and conclusion for some of the analyses are not well described.

      We edited the manuscript according to the reviewer’s suggestions and hopefully improved its readability.

      Reviewer #1 (Recommendations For The Authors):

      (1) The cgNA+ model parameters are derived from all-atom molecular dynamics (MD) simulations, yet there is no consensus within all-atom MD simulations regarding the impact of CpG methylation on DNA mechanical properties. The authors could consider fitting the coarse-grained model with a different all-atom force field to verify whether the conclusions regarding the effects of methylation and hydroxymethylation on DNA nucleosome wrapping energies still hold. For further details on MD simulations related to CpG methylation effects, the authors are advised to consult the review paper by Li et al. (2022) titled “DNA methylation: Precise modulation of chromatin structure and dynamics” published in Current Opinion in Structural Biology.

      Parametrizing the cgNA+ model using MD simulations with various force fields is certainly of interest to us. However, due to the computational cost involved, it remains a goal for future work.

      (2) Beyond DNA mechanical properties, which are directly linked to nucleosome wrapping energies in this study, the authors might also consider other factors such as geometric properties that could influence nucleosome formation. This approach might help the authors to reconcile the observed higher nucleosome occupancy scores for methylated CpGs. The authors are encouraged to review the aforementioned paper for additional experimental and MD simulation studies that could support this perspective.

      Geometric properties of DNA are directly incorporated into our method through the cgNA+ model equilibrium shape prediction µ. We compute the mechanical energy needed to deform µ to a nucleosomal configuration. Notably, the equilibrium shape µ is sensitive to methylation, as demonstrated in Figure 3.

      (3) There are some issues with citation accuracy in the manuscript. For instance, in the Discussion section, the authors attribute a statement to Collings et al. and Anderson (2017), claiming that “methylated regions, known to have high wrapping energy, are among the highest nucleosome occupied elements in the genome.” However, upon reviewing this paper, it appears that it does not make any claims about the high wrapping energy of methylated regions.

      The paragraph is now edited and a separate citation, Pérez et al. (2012), is given for the statement that methylated regions have high wrapping energy.

      Reviewer #2 (Recommendations For The Authors):

      Please improve the readability by:

      (1) making clear that -ln ρ in Eq. 6 on page 4 is actually the free energy. Also, the word entropy comes too late (on page 7) where the best explanation of Eq. 6 is presented.

      We added a comment about -ln ρ being the free energy after Eq. 6 and also included an equation, relating ln ρ and entropy.
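
      For readers following this exchange, the link between ln ρ and a free energy can be illustrated on a toy Gaussian model. The sketch below is not the cgNA+ code: it assumes a diagonal stiffness matrix with hypothetical values, energies in units of kT, and a density ρ(w) = exp(-U(w))/Z with U(w) = ½ Σ k_i (w_i − µ_i)². Under these assumptions -ln ρ(w) = U(w) + ln Z, where -ln Z is the free energy of the unconstrained (free) chain.

```python
import math

def wrapping_energy(w, mu, k):
    """Quadratic deformation energy U(w) = 1/2 * sum_i k_i (w_i - mu_i)^2 (units of kT)."""
    return 0.5 * sum(ki * (wi - mi) ** 2 for wi, mi, ki in zip(w, mu, k))

def log_partition(k):
    """ln Z for a Gaussian with diagonal stiffness k: Z = prod_i sqrt(2*pi / k_i)."""
    return sum(0.5 * math.log(2.0 * math.pi / ki) for ki in k)

def neg_log_density(w, mu, k):
    """-ln rho(w) = U(w) + ln Z, i.e. bound-state energy minus free-state free energy."""
    return wrapping_energy(w, mu, k) + log_partition(k)

# Hypothetical three-coordinate example: same rest shape and target, two stiffnesses.
mu = [0.0, 0.0, 0.0]
target = [1.0, -0.5, 0.2]
soft = [1.0, 1.0, 1.0]
stiff = [4.0, 4.0, 4.0]

for name, k in (("soft", soft), ("stiff", stiff)):
    print(name, wrapping_energy(target, mu, k), neg_log_density(target, mu, k))
```

      With these made-up numbers the stiffer chain has the higher deformation energy but the lower -ln ρ, because it gives up less entropy on confinement. This shows only that the two scalar measures can rank sequences differently; it says nothing about the magnitudes in the actual model.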

      (2) Pages 12 and 13 show two sets of experimental data. They are quite different from each other. When reading this, I wondered why there is this difference. But only on page 16 do you explain that these are different cell types. The difference should be explained already when the papers are introduced on page 12.

      A corresponding sentence already appeared on page 12: “The observations about nucleosome occupancy should be regarded as preliminary, and be treated with caution, as they are based on experimental data obtained for the cancerous HeLa cells Schwartz et al. (2019) and human genome embryonic stem cells Yazdi et al. (2015)”. We have now also added this information to the first paragraph of the subsection for clarity.

      Finally, I add here some general thoughts that came up when reading the paper, comparing your findings with earlier findings in the field. This is not a strict one-to-one comparison and thus does not have to find its way into this manuscript but might give ideas for future studies. Experiments suggest that nucleosomes prefer DNA with a high content of C’s and G’s. Figure 2 does not look at the GC content but at the number of CpG’s. But in any case, let’s use this as a proxy for GC content. Figure 2a suggests that there is not a strong dependence of the bending energy on the number of CpG steps. This is consistent with earlier work with the rigid basepair model which shows the same behavior for GC content (for both MD and crystal parametrizations). Figure 2c (related to the negative free energy) shows that with an increasing number of CpG steps the propensity to bind goes down. This suggests that the entropic cost to confine CpG-rich DNA increases, which in turn reflects that these DNA stretches are softer. This is rather interesting since in the case of the rigid basepair model this effect is observed only when stiffnesses are extracted from crystal data not MD data (however, this refers again to GC content). This might indicate a difference between the rigid bp model and cgNA+ which will be interesting to study in the future. Interesting is also the effect of CpG methylation. The stiffer methylated steps lead to an increase in the energy with the number of such steps (Figure 2a). The entropic cost for binding is thus expected to be smaller and this is indeed observed in Figure 2c when compared to the non-methylated steps.

      We thank the reviewer for this comment. As for the GC content, the energy and ln ρ plots are indeed very similar to those in Figure 2.

      Reviewer #3 (Recommendations For The Authors):

      (1) The formulation of the cgNA+ model in the method section was not easy to follow and can be described better to improve clarity.

      We have revised the model description and hope that its clarity has been improved.

      (2) The authors mention utilizing 100 human genome sequences with 100 configurations from DB. It would be helpful to clarify the source of these 100 human genome sequences. Are these 100 distinct regions on the human reference genome, or are they from a specific dataset or database?

      We now include an explanation about the origin of sequences: “The human genome sequences are a random subset of our sequence sample for the CGI and NMI intersection in the Chromosome 1, but the following observations remain unchanged for sequence samples from different genomic regions.”

      (3) The authors mention the lack of tail unwrapping in their model. It would be beneficial to understand the magnitude of this issue and its potential impact on the overall results. How significant is the lack of unwrapping events in their current model?

      We observed the unwrapping of approximately five base-pairs at each end of our predicted nucleosome configurations, in comparison to the experimental configurations (Figure 1). This issue could be solved by adding additional constraints at the ends of the 147 bp sequence. The wrapping energy would increase marginally, as only about 10 of 147 bp would be affected. We added this remark to the main text.

      (4) Observations from Figure 3 are not described properly. Are these differences statistically significant? Why is twist higher for CpG sites but lower for a roll?

      We added an explanation of how the statistics were computed to the caption of Figure 3. In fact, we didn’t use statistical estimates here, but generated all the possible cases and computed the exact statistics (for the given set of our model parameters). Regarding the changes in twist and roll, we have added the following comment on page 7: “The ground state changes resulting from cytosine modifications – primarily characterized by an average increase in roll and a decrease in twist – may be linked to steric hindrance caused by the cytosine 5-substituent (Battistini et al. (2021)). Notably, the negative coupling between twist and roll has already been observed in X-ray crystallography data (Olson et al. (1998)).”

      (5) Figure 4 does not clarify the authors’ conclusion of higher stiffness for ApT and TpA dinucleotides. The authors should provide further explanation for this observation.

      We revised the text to clarify that the statement regarding ApT and TpA being the most stiff and the most flexible dinucleotides is not a conclusion derived from Figure 4, but rather from earlier work that we cite.

      (6) In Figure 7, the authors note that methylated CGIs have higher nucleosome occupancy on average than unmethylated sequences. Is this observation statistically significant?

      We observe that methylated sequences have a higher average occupancy than unmethylated sequences in the Yazdi et al. data, when the CpG count falls into the intervals from 5 to 14 and from 15 to 24. For each of the two intervals this difference is statistically significant: the permutation test, used due to the lack of normality, yields a p-value of 0.0001 for both cases. The differences in mean scores shown in Figure 8 are also statistically significant. Such test results are expected, given the large sample sizes and the observed differences in means, so we prefer not to include this discussion in the main text.
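
      As background for this response, a two-sample permutation test for a difference in means can be sketched as follows. This is a generic illustration, not the authors' analysis script: the function name, the number of permutations, and the occupancy scores are all hypothetical, and the p-value uses the common add-one correction.

```python
import random

def permutation_test(a, b, n_perm=9999, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns (hits + 1) / (n_perm + 1), where hits counts the label
    shufflings whose absolute mean difference is at least the observed one.
    """
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    n_b = len(pooled) - n_a
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment of group labels
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Made-up occupancy scores for two groups of sequences (illustration only).
methylated = [0.62, 0.71, 0.66, 0.74, 0.69, 0.73, 0.68, 0.70]
unmethylated = [0.51, 0.55, 0.49, 0.58, 0.53, 0.50, 0.56, 0.52]
print(permutation_test(methylated, unmethylated))
```

      With this estimator the smallest attainable p-value is 1/(n_perm + 1), so a reported value is bounded below by the number of permutations drawn. The test makes no normality assumption, which matches the reason the authors give for choosing it.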

      (7) The authors note that their analyses to correlate nucleosome occupancy profile with the methylation state of underlying sequences are preliminary, as different cell lines were used to perform these analyses. Given this inconsistency, it needs to be clarified why this analysis was performed and what the takeaway is.

      We added the following comment at the end of the Results section: “Although comparing data from different cell lines is not optimal, to the best of our knowledge, no publicly available methylation and nucleosome occupancy data exist for the entire human genome within the same cell type. Nevertheless, since the lowest log probability densities in the human genome are predicted for CpG-rich sequences regardless of their methylation state (Figure 2d), and the same holds for both sets of the nucleosome occupancy scores (Figure 7), we conclude that the lowest occupancies occur for sequences with the lowest log probability densities.”

    1. eLife Assessment

      The authors addressed an important biological question, namely the role of glutamine metabolism in humoral responses, and they obtained solid conclusions. The strength of this study is that the authors used state-of-the-art transgenic mouse models together with in vitro analysis, thereby providing significant insights into the question posed. The following would strengthen the manuscript: i) adding more in-depth functionality/physiological relevance in the discussion part, and ii) regarding the experiments, the inclusion of more appropriate controls and a clearer and more accurate description of the methods.

    2. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Cho et al. present a comprehensive and multidimensional analysis of glutamine metabolism in the regulation of B cell differentiation and function during immune responses. They further demonstrate how glutamine metabolism interacts with glucose uptake and utilization to modulate key intracellular processes. The manuscript is clearly written, and the experimental approaches are informative and well-executed. The authors provide a detailed mechanistic understanding through the use of both in vivo and in vitro models. The conclusions are well supported by the data, and the findings are novel and impactful. I have only a few, mostly minor, concerns related to data presentation and the rationale for certain experimental choices.

      Detailed Comments:

      (1) In Figure 1b, it is unclear whether total B cells or follicular B cells were used in the assay. Additionally, the in vitro class-switch recombination and plasma cell differentiation experiments were conducted without BCR stimulation, which makes the system appear overly artificial and limits physiological relevance. Although the effects of glutamine concentration on the measured parameters are evident, the results cannot be confidently interpreted as true plasma cell generation or IgG1 class switching under these conditions. The authors should moderate these claims or provide stronger justification for the chosen differentiation strategy. Incorporating a parallel assay with anti-BCR stimulation would improve the rigor and interpretability of these findings.

      (2) In Figure 1c, the DMK alone condition is not presented. This hinders readers' ability to properly assess the glutaminolysis dependency of the cells for the measured readouts. Also, CD138+ in developing PCs goes hand in hand with decreased B220 expression. A representative FACS plot showing the gating strategy for the in vitro PCs should be added as a supplementary figure. Similarly, division number (going all the way to #7) may be tricky to gate and interpret. A representative FACS plot showing the separation of B cells according to their division numbers and a subsequent gating of CD138 or IgG1 in these gates would be ideal for demonstrating the authors' ability to distinguish these populations effectively.

      (3) A brief explanation should be provided for the exclusive use of IgG1 as the readout in class-switching assays, given that naïve B cells are capable of switching to multiple isotypes. Clarifying why IgG1 was preferentially selected would aid in the interpretation of the results.

      (4) The immunization experiments presented in Figures 1 and 2 are well designed, and the data are comprehensively presented. However, to prevent potential misinterpretation, it should be clarified that the observed differences between NP and OVA immunizations cannot be attributed solely to the chemical nature of the antigens - hapten versus protein. A more significant distinction lies in the route of administration (intraperitoneal vs. intranasal) and the resulting anatomical compartment of the immune response (systemic vs. lung-restricted). This context should be explicitly stated to avoid overinterpretation of the comparative findings.

      (5) NP immunization is known to be an inducer of an IgG1-dominant Th2-type immune response in mice. IgG2c is not a major player unless a nanoparticle delivery system is used. However, the authors arbitrarily included IgG2c in their assays in Figures 2 and 3. This may be confusing for the readers. The authors should either justify the IgG2c-mediated analyses or remove them from the main figures. (It can be added as supplemental information with proper justification).

      (6) Similarly, in affinity maturation analyses, including IgM is somewhat uncommon. I do not see any point in showing high affinity (NP2/NP20) IgMs (Figure 3d), since that data probably does not mean much.

      (7) Following on my comment for the PC generation in Figure 1 (see above), in Figure 4, a strategy that relies solely on CD40L stimulation is performed. This is highly artificial for the PC generation and needs to be justified, or more physiologically relevant PC generation strategies involving anti-BCR, CD40L, and various cytokines should be shown.

      (8) The effects of CB839 and UK5099 on cell viability are not shown. Including viability data under these treatment conditions would be a valuable addition to the supplementary materials, as it would help readers more accurately interpret the functional outcomes observed in the study.

      (9) It is not clear how the RNA seq analysis in Figure 4h was generated. The experimental strategy and the setup need to be better explained.

    3. Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors investigate the functional requirements for glutamine and glutaminolysis in antibody responses. The authors first demonstrate that the concentrations of glutamine in lymph nodes are substantially lower than in plasma, and that at these levels, glutamine is limiting for plasma cell differentiation in vitro. The authors go on to use genetic mouse models in which B cells are deficient in glutaminase 1 (Gls), the glucose transporter Slc2a1, and/or mitochondrial pyruvate carrier 2 (Mpc2) to test the importance of these pathways in vivo.

      Interestingly, deficiency of Gls alone showed clear antibody defects when ovalbumin was used as the immunogen, but not the hapten NP. For the latter response, defects in antibody titers and affinity were observed only when both Gls and either Mpc2 or Slc2a1 were deleted. These latter findings form the basis of the synthetic auxotrophy conclusion. The authors go on to test these conclusions further using in vitro differentiations, Seahorse assays, pharmacological inhibitors, and targeted quantification of specific metabolites and amino acids. Finally, the authors document reduced STAT3 and STAT1 phosphorylation in response to IL-21 and interferon (both type 1 and 2), respectively, when both glutaminolysis and mitochondrial pyruvate metabolism are prevented.

      Strengths:

      (1) The main strength of the manuscript is the overall breadth of experiments performed. Orthogonal experiments are performed using genetic models, pharmacological inhibitors, in vitro assays, and in vivo experiments to support the claims. Multiple antigens are used as test immunogens--this is particularly important given the differing results.

      (2) B cell metabolism is an area of interest but understudied relative to other cell types in the immune system.

      (3) The importance of metabolic flexibility and caution when interpreting negative results is made clear from this study.

      Weaknesses:

      (1) All of the in vivo studies were done in the context of boosters at 3 weeks and recall responses 1 week later. This makes specific results difficult to interpret. Primary responses, including germinal centers, are still ongoing at 3 weeks after the initial immunization. Thus, untangling what proportion of the defects are due to problems in the primary vs. memory response is difficult.

      (2) Along these lines, the defects shown in Figure 3h-i may not be due to the authors' interpretation that Gls and Mpc2 are required for efficient plasma cell differentiation from memory B cells. This interpretation would only be correct if the absence of Gls/Mpc2 leads to preferential recruitment of low-affinity memory B cells into secondary plasma cells. The more likely interpretation is that ongoing primary germinal centers are negatively impacted by Gls and Mpc2 deficiency, and this, in turn, leads to reduced affinities of serum antibodies.

      (3) The gating strategies for germinal centers and memory B cells in Supplemental Figure 2 are problematic, especially given that these data are used to claim only modest and/or statistically insignificant differences in these populations when Gls and Mpc2 are ablated. Neither strategy shows distinct flow cytometric populations, and it does not seem that the quantification focuses on antigen-specific cells.

      (4) Along these lines, the conclusions in Figure 6a-d may need to be tempered if the analysis was done on polyclonal, rather than antigen-specific cells. Alum induces a heavily type 2-biased response and is not known to induce much of an interferon signature. The authors' observations might be explained by the inclusion of other ongoing GCs unrelated to the immunization.

    4. Reviewer #3 (Public review):

      Summary:

      In their manuscript, the authors investigate how glutaminolysis (GLS) and mitochondrial pyruvate import (MPC2) jointly shape B cell fate and the humoral immune response. Using inducible knockout systems and metabolic inhibitors, they uncover a "synthetic auxotrophy": When GLS activity/glutaminolysis is lost together with either GLUT1-mediated glucose uptake or MPC2, B cells fail to upregulate mitochondrial respiration, IL-21/STAT3 and IFN/STAT1 signaling is impaired, and the plasma cell output and antigen-specific antibody titers drop significantly. This work thus demonstrates the promotion of plasma cell differentiation and cytokine signaling through parallel activation of two metabolic pathways. The dataset is technically comprehensive and conceptually novel, but some aspects leave the in vivo and translational significance uncertain.

      Strengths:

      (1) Conceptual novelty: the study goes beyond single-enzyme deletions to reveal conditional metabolic vulnerabilities and fate-deciding mechanisms in B cells.

      (2) Mechanistic depth: the study uncovers a novel "metabolic bottleneck" that impairs mitochondrial respiration and elevates ROS, and directly ties these changes to cytokine-receptor signaling. This is both mechanistically compelling and potentially clinically relevant.

      (3) Breadth of models and methods: inducible genetics, pharmacology, metabolomics, seahorse assay, ELISpot/ELISA, RNA-seq, two immunization models.

      (4) Potential clinical angle: the synergy of CB839 with UK5099 and/or hydroxychloroquine hints at a druggable pathway targeting autoantibody-driven diseases.

      Weaknesses:

      (1) Physiological relevance of "synthetic auxotrophy"

      The manuscript demonstrates that GLS loss is only crippling when glucose influx or mitochondrial pyruvate import is concurrently reduced, which the authors name "synthetic auxotrophy". I think it would help readers to clarify the terminology more and add a concise definition of "synthetic auxotrophy" versus "synthetic lethality" early in the manuscript and justify its relevance for B cells.

      While the overall findings, especially the subset specificity and the clinical implications, are generally interesting, the "synthetic auxotrophy" condition feels a little engineered. Therefore, the findings strongly raise the question of the likelihood of such a "double hit" in vivo and whether there are conditions, disease states, or drug regimens that would realistically generate such a "bottleneck". Hence, the authors should document or at least discuss whether GC or inflamed niches naturally show simultaneous downregulation/lack of glutamine and/or pyruvate. The authors should also aim to provide evidence that infections (e.g., influenza), hypoxia, treatments (e.g., rapamycin), or inflammatory diseases like lupus co-limit these pathways.

      It would hence also be beneficial to test the CB839 + UK5099/HCQ combinations in a short, proof-of-concept treatment in vivo, e.g., shortly before and after the booster immunization or in an autoimmune model. Likewise, it may also be insightful to discuss potential effects of existing treatments (especially CB839, HCQ) on human memory B cell or PC pools.

      (2) Cell survival versus differentiation phenotype

      Claims that the phenotypes (e.g., reduced PC numbers) are "independent of death" and are not merely the result of artificial cell stress would benefit from Annexin-V/active-caspase 3 analyses of GC B cells and plasmablasts. Please also show viability curves for inhibitor-treated cells.

      (3) Subset specificity of the metabolic phenotype

      Could the metabolic differences, mitochondrial ROS, and membrane-potential changes shown for activated pan-B cells (Figure 5) also be demonstrated ex vivo for KO mouse-derived GC B cells and plasma cells? This would also be insightful to investigate following NP-immunization (e.g., NP+ GC B cells 10 days after NP-OVA immunization).

      (4) Memory B cell gating strategy

      I am not fully convinced that the memory-B-cell gate in Supplementary Figure 2d is appropriate. The legend implies the population is defined simply as CD19+GL7-CD38+ (or CD19+CD38++?), with no further restriction to NP-binding cells. Such a gate could also capture naïve or recently activated B cells. From the descriptions in the figure and the figure legend, it is hard to verify that the events plotted truly represent memory B cells. Please clarify the full gating hierarchy and, ideally, restrict the MBC gate to NP+CD19+GL7-CD38+ B cells (or add additional markers such as CD80 and CD273). Generally, the manuscript would benefit from a more transparent presentation of gating strategies.

      (5) Deletion efficiency

      mRNA data show residual GLS/MPC2 transcripts (Supplementary Figure 8). Please quantify deletion efficiency in GC B cells and plasmablasts.

    1. Author response:

      The following is the authors’ response to the original reviews

      eLife Assessment

      This valuable study revisits the effects of substitution model selection on phylogenetics by comparing reversible and non-reversible DNA substitution models. The authors provide evidence that 1) non time-reversible models sometimes perform better than general time-reversible models when inferring phylogenetic trees out of simulated viral genome sequence data sets, and that 2) non time-reversible models can fit the real data better than the reversible substitution models commonly used in phylogenetics, a finding consistent with previous work. However, the methods are incomplete in supporting the main conclusion of the manuscript, that is that non time-reversible models should be incorporated in the model selection process for these data sets.

      The non-reversible models should be incorporated in the model selection process not because they perform significantly better, but only because they do not perform worse than the reversible models and because the true biochemical processes of nucleotide substitution do support non-reversibility.

      Reviewer #1 (Public Review):

      The study by Sianga-Mete et al revisits the effects of substitution model selection on phylogenetics by comparing reversible and non-reversible DNA substitution models. This topic is not new; previous works already showed that non-reversible, and also covarion, substitution models can fit the real data better than the reversible substitution models commonly used in phylogenetics. In this regard, the results of the present study are not surprising. Specific comments are shown below.

      True

      It is well known that non-reversible models can fit the real data better than the commonly used reversible substitution models, see for example,

      https://academic.oup.com/sysbio/article/71/5/1110/6525257

      https://onlinelibrary.wiley.com/doi/10.1111/jeb.14147?af=R

      The manuscript indicates that the results (better fitting of non-reversible models compared to reversible models) are surprising but I do not think so, I think the results would be surprising if the reversible models provide a better fitting.

      I think the introduction of the manuscript should be increased with more information about non-reversible models and the diverse previous studies that already evaluated them. Also I think the manuscript should indicate that the results are not surprising, or more clearly justify why they are surprising.

      The surprise in the findings is in NREV12 performing better than NREV6 for double stranded DNA viruses as it was expected that NREV6 would perform better given the biochemical processes discussed in the introduction.

      In the introduction and/or discussion I missed a discussion about the recent works on the influence of substitution model selection on phylogenetic tree reconstruction. Some works indicated that substitution model selection is not necessary for phylogenetic tree reconstruction,

      https://academic.oup.com/mbe/article/37/7/2110/5810088

      https://www.nature.com/articles/s41467-019-08822-w

      https://academic.oup.com/mbe/article/35/9/2307/5040133

      While others indicated that substitution model selection is recommended for phylogenetic tree reconstruction,

      https://www.sciencedirect.com/science/article/pii/S0378111923001774

      https://academic.oup.com/sysbio/article/53/2/278/1690801

      https://academic.oup.com/mbe/article/33/1/255/2579471

      The results of the present study seem to support this second view. I think this study could be improved by providing a discussion about this aspect, including the specific contribution of this study to that.

      In our conclusion we have stated that:

      The lack of available data regarding the proportions of viral life cycles during which genomes exist in single and double stranded states makes it difficult to rationally predict the situations where the use of models such as GTR, NREV6 and NREV12 might be most justified, particularly in light of the poor overall performance of NREV6 and GTR relative to NREV12 with respect to describing mutational processes in viral genome sequence datasets. We therefore recommend case-by-case assessments of NREV12 vs NREV6 vs GTR model fit when deciding whether it is appropriate to consider the application of non-reversible models for phylogenetic inference and/or phylogenetic model-based analyses such as those intended to test for evidence of natural selection or the existence of molecular clocks.

      The real data was downloaded from the Los Alamos HIV database. I am wondering whether there was any criterion for selecting the sequences or whether all the sequences in the database for every studied virus category were analysed. Also, was any quality filter applied? How were gaps and ambiguous nucleotides handled? Note that these aspects could affect the fit of the models to the data.

      We selected varying numbers of sequences from the database for each virus type studied. Using the software AliView, we performed quality filtering by re-aligning the sequences per virus type.

      How the non-reversible model and the data are compared considering the non-reversible substitution process? In particular, given an input MSA, how to know if the nucleotide substitution goes from state x to state y or from state y to state x in the real data if there is not a reference (i.e., wild type) sequence? All the sequences are mutants and one may not have a reference to identify the direction of the mutation, which is required for the non-reversible model. Maybe one could consider that the most abundant state is the wild type state but that may not be the case in reality. I think this is a main problem for the practical application of non-reversible substitution models in phylogenetics.

      True

      Reviewer #1 (Recommendations for the authors):

      The reversible and non-reversible models used in this study assume that all the sites evolve under the same substitution matrix, which can be unrealistic. This aspect could be mentioned.

      Done

      The manuscript indicates that "a phylogenetic tree was inferred from an alignment of real sequences (Avian Leukosis virus) with an average sequence identity (API) of ~90%.". I was wondering under which substitution model that phylogenetic tree reconstruction was performed? could the use of that model bias posterior results in terms of favoring results based on such a model?

      We have stated that the GTR+G model was used to reconstruct the tree. Yes, the use of the GTR+G model could bias the posterior results, as we have also stated in the paper.

      I was wondering which specific R function was used to calculate the weighted Robinson-Foulds metric. I think this should be included in the manuscript.

      We stated that we used the weighted Robinson-Foulds metric (wRF; implemented in the R phangorn package (Schliep, 2011)).
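
      As an illustration of what the weighted Robinson-Foulds distance measures, the following is a minimal Python sketch, not the authors' code. phangorn does provide a function named wRF.dist, but the response does not state which call was used. The sketch assumes each tree is given as a mapping from bipartitions (splits) to branch lengths; the helper names canonical, make_tree and weighted_rf, the four taxa and all branch lengths are hypothetical.

```python
def canonical(side, taxa):
    """Represent a bipartition by the side that excludes the reference taxon."""
    side = frozenset(side)
    reference = min(taxa)  # any fixed taxon works; here the alphabetically first
    return frozenset(taxa) - side if reference in side else side


def make_tree(branches, taxa):
    """Build a {split: branch length} mapping from (side, length) pairs."""
    return {canonical(side, taxa): length for side, length in branches}


def weighted_rf(tree_a, tree_b):
    """Sum |length_a - length_b| over every split found in either tree.

    A split that is missing from one tree counts as having length 0 there.
    """
    splits = set(tree_a) | set(tree_b)
    return sum(abs(tree_a.get(s, 0.0) - tree_b.get(s, 0.0)) for s in splits)


# Hypothetical four-taxon trees (values are illustrative only).
TAXA = {"A", "B", "C", "D"}

# Tree 1: ((A,B),(C,D)) with an internal branch of length 0.5.
tree_1 = make_tree(
    [(("A", "B"), 0.5), (("A",), 0.1), (("B",), 0.1), (("C",), 0.1), (("D",), 0.1)],
    TAXA,
)
# Tree 2: ((A,C),(B,D)) with an internal branch of length 0.3 and a longer branch to A.
tree_2 = make_tree(
    [(("A", "C"), 0.3), (("A",), 0.2), (("B",), 0.1), (("C",), 0.1), (("D",), 0.1)],
    TAXA,
)

distance = weighted_rf(tree_1, tree_2)
print(f"wRF distance: {distance:.3f}")
```

      Because branch lengths enter the sum directly, poorly supported or near-zero internal branches contribute little to the distance, which is one reason the weighted and unweighted metrics can rank the same pair of trees differently.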

      Although they are a minority, several datasets fitted a reversible model better than a non-reversible model. I think that should be clearly indicated. In addition, in my opinion the AIC does not sufficiently penalize the number of model parameters and therefore favors the non-reversible models over the reversible models, but this is only my opinion based on the definition of AIC and it is not supported. Thus, I think the comparison between phylogenetic trees reconstructed under different substitution models was a good idea (but see also my second major comment).

      Noted

      When comparing phylogenetic trees I was wondering if one should consider the effect of the estimation method and quality of the studied data? For example, should bootstrap values be estimated for all the ancestral nodes and only ancestral nodes with high support be evaluated in the comparison among trees?

      Yes, the estimation method and quality of the studied data should be considered. This does not matter when using RF, but it does for weighted RF. When building the trees using RAxML, only high-support nodes are added to the tree.

      In Figure 3, I do not see (by eye) significant differences among the models. I see in the legend that the statistical evaluation was based on a t test but I am not much convinced. Maybe it is only my view. Exactly, which pairs of datasets are evaluated with the t test? Next, I would expect that the influence of the substitution model on the phylogenetic tree reconstruction is higher at large levels of nucleotide diversity because with more substitution events there is more information to see the effects of the model. However, the t test seems to show that differences are only at low levels of nucleotide diversity (and large DNR), what could be the cause of this?

      The paired t-tests compare the wRF distances between the inferred true tree and the trees simulated using the GTR model versus the wRF distances between the inferred true tree and the trees simulated using the NREV12 model.

      The reason why the influence of the NREV12 model on the tree reconstructed is not significantly higher at large levels of nucleotide diversity could be because at a certain level the DNR are simply unrealistic.

      Can the user perform substitution model selection (i.e., AIC) among reversible and non-reversible substitution models with IQTREE? If yes, then doing that should be the recommendation from this study, correct?

      But, can DNR be estimated from a real dataset? DNR seems to be the key factor (Figure 3) for the phylogenetic analysis under a proper model.

      Substitution model selection can be performed among reversible and non-reversible models using both HyPhy and IQTREE. We have recommended that model tests be done as a first step before tree building. Estimating DNR from real datasets requires a substitution rate matrix of a non-reversible model.

      The manuscript has many text errors (including typos and incorrect citations). For example, many citations in page 20 show "Error! Reference source not found.". I think authors should double check the manuscript before submitting. Also, some text is not formally written. For example, "G represents gamma-distributed rates", rates of what? The text should be clear for readers that are not familiar with the topic (i.e., G represents gamma-distributed substitution rates among sites). In general, I recommend a detailed revision of the whole text of the manuscript.

      Done

      Reviewer #2 (Public Review):

      The authors evaluate whether non time reversible models fit better data presenting strand-specific substitution biases than time reversible models. Specifically, the authors consider what they call NREV6 and NREV12 as candidate non time-reversible models. On the one hand, they show that AIC tends to select NREV12 more often than GTR on real virus data sets. On the other hand, they show using simulated data that NREV12 leads to inferred trees that are closer to the true generating tree when the data incorporates a certain degree of non time-reversibility.
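
      For readers less familiar with the terminology, the distinction between time-reversible and non time-reversible models can be stated as a condition on the substitution rate matrix. The block below is a standard textbook formulation, not taken from the manuscript; the symbols Q, q_{ij}, r_{ij} and \pi are notation assumed here for illustration.

```latex
% Time-reversibility (detailed balance) for a rate matrix Q = (q_{ij})
% with stationary nucleotide frequencies \pi:
\pi_i \, q_{ij} = \pi_j \, q_{ji} \qquad \text{for all } i \neq j .

% Under this constraint the off-diagonal rates factor as
q_{ij} = r_{ij} \, \pi_j , \qquad r_{ij} = r_{ji} ,

% which leaves 6 symmetric exchangeabilities r_{ij} for 4 nucleotides (GTR).
% Without the constraint, all 12 off-diagonal rates q_{ij} may differ.
```

      Under this formulation, a reversible model cannot distinguish a substitution from x to y from one in the opposite direction at stationarity, whereas an unconstrained model with 12 distinct rates can.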

      Based on these two experimental results, the authors conclude that "We show that non-reversible models such as NREV12 should be evaluated during the model selection phase of phylogenetic analyses involving viral genomic sequences". This is a valuable finding, and I agree that this is potentially good practice.

      However, I miss an experiment that links the two findings to support the conclusion: in particular, an experiment that solves the following question: does the best-fit model also lead to better tree topologies?

      By NREV12 leading to inferred trees that are closer to the true generating tree as compared to GTR, it then shows that the best-fit model in this case being NREV12 leads to better tree topologies.

      On simulated data, the significance of the difference between GTR and NREV12 inferences is evaluated using a paired t test. I miss a rationale or a reference to support that a paired t test is suitable to measure the significance of the differences of the wRF distance. Also, the results show that on average NREV12 performs better than GTR, but a pairwise comparison would be more informative: for how many sequence alignments does NREV12 perform better than GTR?

      We used the paired t-test because it is the most widely used test for comparing mean values between two matched samples where the paired differences are normally distributed. The wRF distances meet these conditions.

      The paired t-test contains the pairwise comparison, and the side-by-side boxplots show the pairwise wRF comparisons.
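
      The two quantities discussed in this exchange, the paired t statistic and the per-alignment count of how often one model yields the smaller wRF distance, can be computed side by side. The following is a minimal sketch using only the Python standard library; the function name paired_summary and the five pairs of wRF values are hypothetical and are not data from the study.

```python
import math
from statistics import mean, stdev


def paired_summary(wrf_gtr, wrf_nrev12):
    """Summarise paired wRF distances.

    Returns (t statistic, degrees of freedom, wins, ties, losses), where a
    "win" is an alignment for which NREV12 gives the smaller wRF distance.
    """
    if len(wrf_gtr) != len(wrf_nrev12):
        raise ValueError("paired samples must have equal length")
    # Positive difference: the NREV12 tree is closer to the true tree.
    diffs = [g - n for g, n in zip(wrf_gtr, wrf_nrev12)]
    n = len(diffs)
    # Paired t statistic: mean difference divided by its standard error.
    t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    wins = sum(d > 0 for d in diffs)
    ties = sum(d == 0 for d in diffs)
    return t_stat, n - 1, wins, ties, n - wins - ties


# Hypothetical wRF distances for five simulated alignments.
gtr = [0.50, 0.40, 0.60, 0.30, 0.70]
nrev12 = [0.40, 0.40, 0.50, 0.35, 0.60]

t_stat, df, wins, ties, losses = paired_summary(gtr, nrev12)
print(f"t = {t_stat:.3f} on {df} df; NREV12 better in {wins}, tied in {ties}, worse in {losses}")
```

      The t statistic summarises the mean difference, while the win/tie/loss counts answer the separate question of how many alignments favour each model; the two can disagree when a few alignments show large differences.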

      Reviewer #2 (Recommendations for the authors):

      The authors reference Baele et al., 2010 for describing NREV6 and NREV12. I suggest using the same name used in the referenced paper: GNR-SYM and GNR respectively. Although I do not think there is a standard name for these models, I would use a previously used one.

      We have built studies based on the names NREV6 and NREV12. We would like to keep the naming as standard for our studies.

      GTR and NREV12 models are already described in many other papers. I do not see the need to include such an extensive description. Also, a reference should be included to the discrete Gamma rate categories [1]

      We included the extensive description to help readers who are not very familiar with these models understand them better, since we have given the models our own names, which differ from those used in other papers.

      We have added referencing for the discrete gamma rate as recommended. (Yang, 1994)

      To evaluate the exhaustiveness and correctness of the results, I would recommend publishing as supplementary material the simulated data sets or the scripts for generating the data set, the scripts or command lines for the analysis, and the versions of the software used (e.g., IQTREE). Also, to strongly support the main conclusion of the manuscript, I suggest adding to the simulations section results the RF-distances of the best-fit selected model under AIC, AICc, and BIC as well.

      We can go ahead and submit all the needed datasets. The simulated data RF-Distances results are available and will be submitted. We cannot however add them to the main document as this will create very long data tables.

      In some instances, it is mentioned that the selection criterion used is AIC, while in others, AIC-c is referenced. Even in the table captions, both terms are mixed. It should be made clearer which criterion is being employed, as AIC is not suitable for addressing the overparameterization of evolutionary models, given that it does not account for the sample size. A previous pre-print of this article [2] does not mention AIC-c, but also explicitly includes the formulas for AIC that do not take the sample size into account, and reports the same results as this manuscript, which indicates that AIC and not AIC-c was used here. This should be clarified. It is recommended to use AIC-c instead of AIC, especially if the sample size to model parameters ratio is low [3]. Two points may be noted here: some authors consider tree branch lengths as free model parameters and others do not. In this paper it is not specified how the model parameters are counted. AIC tends to select more parameterized models than AIC-c, and overparameterization can lead to different tree inferences, as evidenced in Hoff et al., 2016. Therefore, it is expected that NREV12 is more frequently selected than NREV6 and GTR.
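
      The difference between the criteria can be made concrete with the standard formulas. The following is a minimal Python sketch; the log-likelihoods, parameter counts and sample size are hypothetical values chosen for illustration, not values from the manuscript, and the sketch takes no position on whether branch lengths should be counted as parameters or on what should be treated as the sample size.

```python
import math


def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_lik


def aicc(log_lik, k, n):
    """AIC with the small-sample correction 2k(k + 1) / (n - k - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return aic(log_lik, k) + (2 * k * (k + 1)) / (n - k - 1)


def bic(log_lik, k, n):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2 * log_lik


# Hypothetical fits: a simpler model and a richer model with three extra
# parameters and a slightly higher log-likelihood, on a small sample.
n = 30
simple = {"log_lik": -1000.0, "k": 8}
rich = {"log_lik": -996.0, "k": 11}

for name, fit in (("simple", simple), ("rich", rich)):
    print(
        name,
        f"AIC={aic(fit['log_lik'], fit['k']):.2f}",
        f"AICc={aicc(fit['log_lik'], fit['k'], n):.2f}",
        f"BIC={bic(fit['log_lik'], fit['k'], n):.2f}",
    )
```

      With these illustrative numbers, AIC prefers the richer model while AICc and BIC prefer the simpler one, which is the behaviour the comment above describes for a low ratio of sample size to parameters.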

      In my opinion, a pairwise comparison between GTR and NREV12 performance is of great interest here, and the whiskers plots are not useful. Scatterplots would display the results better.

      Boxplots are meant to offer a simplified view of the results, as the paired t-test does all of the comparisons. We shall provide the scatter plots as supplementary information so that readers can get full detailed plots as recommended.

      Some references are missing.

      Missing references added

    2. Reviewer #1 (Public Review):

      The study by Sianga-Mete et al revisits the effects of substitution model selection on phylogenetics by comparing reversible and non-reversible DNA substitution models. This topic is not new; previous works already showed that non-reversible, and also covarion, substitution models can fit the real data better than the reversible substitution models commonly used in phylogenetics. In this regard, the results of the present study are not surprising.

    3. Reviewer #2 (Public Review):

      The authors evaluate whether non time reversible models fit better data presenting strand-specific substitution biases than time reversible models. Specifically, the authors consider what they call NREV6 and NREV12 as candidate non time-reversible models. On the one hand, they show that AIC tends to select NREV12 more often than GTR on real virus data sets. On the other hand, they show using simulated data that NREV12 leads to inferred trees that are closer to the true generating tree when the data incorporates a certain degree of non time-reversibility. Based on these two experimental results, the authors conclude that "We show that non-reversible models such as NREV12 should be evaluated during the model selection phase of phylogenetic analyses involving viral genomic sequences". This is a valuable finding, and I agree that this is potentially good practice. However, I miss an experiment that links the two findings to support the conclusion: in particular, an experiment that solves the following question: does the best-fit model also lead to better tree topologies?

      [Editors' note: the reviewers were sent the revised submission and rebuttal and based on their response, an amended eLife Assessment has been formulated.]

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      In this manuscript, Gruber et al perform serial EM sections of the antennal lobe and reconstruct the neurites innervating two types of glomeruli: one that is narrowly tuned to geosmin and one that is broadly tuned to other odours. They quantify and describe various aspects of the innervations of olfactory sensory neurons (OSNs), uniglomerular projection neurons (uPNs), and the multiglomerular local interneurons (LNs) and PNs (mPNs). They find that narrowly tuned glomeruli had stronger connectivity from OSNs to PNs and LNs, and considerably more connections between sister OSNs and sister PNs than the broadly tuned glomeruli. They also had less connectivity with the contralateral glomeruli. These observations are suggestive of strong feed-forward information flow with minimal presynaptic inhibition in narrowly tuned glomeruli, which might be ecologically relevant, for example, while making quick decisions such as avoiding a geosmin-laden landing site. In contrast, information flow in more broadly tuned glomeruli shows much more lateralisation of connectivity to the contralateral glomerulus, as well as to other ipsilateral glomeruli.

      The data are well presented, the manuscript clearly written, and the results will be useful to the olfaction community. I wonder, given the hemibrain and FAFB datasets exist, whether the authors have considered verifying whether the trends they observe in connectivity hold across three brains? Is it stereotypic? 

      We appreciate the reviewer’s positive view of our study and their thoughtful and relevant comment on the issue of individual variation. We agree that this is a very important question and note that it was also raised by the second Reviewer. It reflects both our limited understanding of the range of individual variation in synaptic connectivity (whether in flies, humans, or other species) and the challenge of determining which of the differences observed in our study are stereotypical features of each glomerulus type. Undoubtedly, this criticism addresses a crucial problem of practically all connectome studies so far, for which there is no immediate solution. This type of study requires so much time, effort and money that increasing the number of samples is seldom feasible. The Reviewer wonders if we could compare our data with that made available by two of the largest connectome studies of Drosophila. This appeared to us to be a very good idea and we have tried to follow the advice but, unfortunately, it was impracticable for the reasons we explain below. The hemibrain data cannot be used for this purpose because it does not contain the full glomerulus DA2 (Schlegel et al., 2021). A different problem hindered us from using the FAFB dataset, the other dataset mentioned by the Reviewer. In this case the three glomeruli were sectioned and reconstructed, but the dataset lacks an annotated list of all synaptic connections corresponding to each glomerulus. Such annotation (a compendium of all synaptic connections inside each glomerulus, specifying for each connection which type of neuron provides the presynaptic site and which the postsynaptic site) is essential for direct comparison with our data. It is important to keep in mind that the current analytical tools available for the use of these datasets (e.g., NeuPrint, FlyWire and CATMAID) do not offer the ability to extract data on synapses exclusively from the glomerular volume of DA2 or DL5.

      In this case, it is certainly theoretically possible to obtain the data by doing the annotation ourselves. However, such a study would demand an amount of time, effort and financial resources that we believe would not be justified solely to increase the number of individuals from one to two. Instead, our manuscript includes a comparison of the OSN connectivity in VA1v and DL5 using the hemibrain dataset published by Schlegel et al. (2021) (see revised manuscript: lines 311–315; 431–434; 558–562; 602–606).

      Beyond the opinion, which we fully share with the Reviewer, that a comparison including three flies would be better than a comparison made with one glomerulus of each type, we are still challenged by the question of which, if any, of the differences are stereotypic. Distinguishing stereotypical differences between particular glomeruli, in features such as those discussed in our study, from differences within the normal range of individual variation is basically a statistical problem. A first attempt at a comprehensive comparison focusing on intra- and inter-individual variability was recently made by comparing two connectome datasets from two different Drosophila individuals (Dorkenwald et al., 2024; Schlegel et al., 2024). At present, it is still unclear how many samples are needed to make a statistically robust comparison of olfactory synaptic circuits in adult flies: perhaps 3, 6, or even 18 individuals?

      Reviewer #2 (Public Review):

      The chemoreceptor proteins expressed by olfactory sensory neurons differ in their selectivity such that glomeruli vary in the breadth of volatile chemicals to which they respond. Prior work assessing the relationship between tuning breadth and the demographics of principal neuron types that innervate a glomerulus demonstrated that narrowly tuned glomeruli are innervated by more projection neurons (output neurons) and fewer local interneurons relative to more broadly tuned glomeruli. The present study used high-resolution electron microscopy to determine which synaptic relationships between principal cell types also vary with glomerulus tuning breadth using a narrowly tuned glomerulus (DA2) and a broadly tuned glomerulus (DL5). The strength of this study lies in the comprehensive, synapse-level resolution of the approach. Furthermore, the authors implement a very elegant approach of using a 2-photon microscope to score the upper and lower bounds of each glomerulus, thus defining the bounds of their restricted regions of interest. There were several interesting differences including greater axo-axonic afferent synapses and dendrodendritic output neuron synapses in the narrowly tuned glomerulus, and greater synapses upon sensory afferents from multiglomerular neurons and output neuron autapses in the broadly tuned glomerulus.

      The study is limited by a few factors. There was a technical need to group all local interneurons, centrifugal neurons, and multiglomerular projection neurons into one category ("multiglomerular neurons") which complicates any interpretations as even multiglomerular projection neurons are very diverse. Additionally, there were as many differences between the two narrowly tuned glomeruli as there were comparing the narrowly and broadly tuned glomeruli. Architecture differences may therefore not reflect differences in tuning breadth, but rather the ecological significance of the odors detected by cognate sensory afferents. 
Finally, some synaptic relationships are described as differing and others as being the same between glomeruli, but with only one sample from each glomerulus, it is difficult to determine when measures differ when there is no measure of inter-animal variability. If these caveats are kept in mind, this work reveals some very interesting potential differences in circuit architecture associated with glomerular tuning breadth.

      This work establishes specific hypotheses about network function within the olfactory system that can be pursued using targeted physiological approaches. It also identifies key traits that can be explored using other high-resolution EM datasets and other glomeruli that vary in their tuning selectivity. Finally, the laser "branding" technique used in this study establishes a reduced-cost procedure for obtaining smaller EM datasets from targeted volumes of interest by leveraging the ability to transgenically label brain regions in Drosophila.

      CLASSIFICATION OF NEURONAL TYPES

      We agree that grouping diverse types of interneurons into a single category (referred to as MGNs) limits the ability to make interpretations about synaptic similarities and differences between specific neuronal types. This was, however, an unavoidable compromise resulting from our decision to generate a comprehensive, synapse-level reconstruction of the restricted regions encompassing the DA2 and DL5 glomeruli. As both reviewers have noted, this approach offers significant value and we hope the Editor will also recognize that this limitation does not prevent readers from gaining important and novel insights into the synaptic circuitry of these two glomeruli.  

      Similar to the approach taken by Tobin et al. (2017), we prioritized producing a densely reconstructed neuropile, in which no synapses were omitted (Tobin et al., 2017). The downside of this method is that not all synaptic connections could be reliably assigned to specific neuronal types, with about 12% remaining unassigned. We anticipate that future research, supported by advances in semi-automated tracing methods, improved imaging technologies, and increased personnel resources, will allow not only for the generation of more complete connectomes of the entire brain (Scheffer et al., 2020; Zheng et al., 2018), but also for the accurate reconstruction and classification of individual synapses, even in highly complex regions such as the olfactory glomeruli. We also expect that a second complete connectome of a male Drosophila will soon become available, which will provide valuable opportunities for comparisons across individuals and between male and female brains in future studies.

      INTERGLOMERULAR DIFFERENCES

      Thank you for this insightful comment. It is indeed true that despite both DA2 and VA1v being narrowly tuned glomeruli, they exhibit considerable differences in specific connectivity features (e.g., relative synaptic strengths above certain thresholds) and that those differences can be as pronounced as those observed between DA2 and the broadly tuned DL5. For this reason, comparing each individual glomerulus to every other is not a practical or informative approach. To derive robust interpretations, we focused instead on whether two glomeruli that share a particular functional characteristic—namely, being narrowly tuned for single odorants—also share connectivity patterns that distinguish them from a broadly tuned reference glomerulus.

      Our results support this. Furthermore, additional connectomics data reinforce our conclusions.

      For example, OSN-OSN connectivity is stronger in the two narrowly tuned glomeruli (DA2 and VA1v) relative to the broadly tuned glomerulus (DL5). While these pairwise differences alone are not conclusive, the finding that the two narrowly tuned glomeruli studied here share features that distinguish them from the broadly tuned glomerulus supports our interpretation. We found further support for this idea in the data reported by Schlegel et al. (2021). In that dataset, other narrowly tuned glomeruli (DA1, DL3, and DL4) also exhibit stronger OSN-OSN connectivity than other broadly tuned glomeruli (DM1 or DM4).

      We do not deny that there are many differences between any given pair of glomeruli, regardless of whether they are narrowly or broadly tuned. Instead, we propose that our findings on circuit features indicate that most of the observed differences actually group the two narrowly tuned glomeruli together relative to the broadly tuned glomerulus. A more concise summary is now provided in the newly added Figure 8. We also added explanatory lines of text at the beginning of the chapter ‘specific features of narrowly tuned glomerular circuits’.

      ECOLOGICAL SIGNIFICANCE

      This is an interesting point. However, it is difficult to disentangle the "ecological significance" of processed odorants from the "tuning breadth" of a glomerulus. In the Drosophila olfactory system, glomerular circuits that respond to ecologically important odorants—such as those involved in reproduction or danger—tend to be more narrowly tuned. Moreover, while we refer to odorants with specific ecological significance as those linked to survival or reproductive behaviors, defining the significance of an odorant with precision is inherently challenging, as it can vary depending on context and environmental conditions.

      What both circuits share is their narrow tuning breadth. We therefore propose that the common circuit features of VA1v and DA2, highlighted in this study, are functionally related to the fact that each circuit processes single odorants. Consequently, their specificity is most likely determined at the level of the receptor. 

      INDIVIDUAL VARIABILITY

      We agree that accounting for inter-animal variability would strengthen the study. However, we are confident that even a modest statistically sound assessment of this variability would require a larger sample size, certainly more than just two or three flies, which is presently not feasible.

      We refer the reviewer to our response to Reviewer #1 regarding this important issue.

      Initial insights into variability between flies have been provided through comparative analyses of the two most comprehensive female Drosophila melanogaster connectomes, the FAFB and hemibrain datasets (Schlegel et al., 2024). For more detailed quantitative comparisons regarding inter-animal variability, please refer to our response to the second major point raised by Reviewer #2. As highlighted by Schlegel et al. (2024), making definitive statements about the stereotypy of neuron numbers, unitary cell-cell connections (edges), or synaptic strengths (weights) remains a complex challenge.

      While appreciating the rigour of this work, we were surprised to notice the omission of a comparison of their observations with the two other existing datasets. This would not only have addressed the technical limitation of this particular study - the inability to identify specific neuron types due to imaging a small part of the brain - but would also have shed light on inter-animal variability.

      We strongly recommend that the authors do make this comparison - the datasets are currently extremely user friendly and so we don't estimate the replication of their key findings will be too onerous. This will be particularly important to resolve the issue of having to classify all multiglomerular local interneurons and multiglomerular projection neurons broadly into "MGN". Such a comparison will dramatically strengthen this study that poses very interesting questions, but in its current form, has this striking shortcoming. 

      INDIVIDUAL VARIABILITY AS EXPRESSED HERE:

      Earlier on we were of the same opinion that the Reviewer expresses here but, unfortunately, it was not possible to follow this advice. As far as it was possible, we have compared some of our results to the values of the two datasets that the Reviewer refers to, but the absence of glomerulus DA2 in one of the datasets and the absence of synapse annotation for all the relevant glomeruli in the other dataset prevented us from making a full comparison. Moreover, we believe that the problem of individual variation most probably cannot be solved by extending the comparison to one or two more flies.

      Reviewer #1 (Recommendations for The Authors): 

      The lines 270 - 282 confused me in the backdrop of Figure 3B. 

      The concern may stem from our inclusion of a comparison between the uPNs of glomerulus DA2 and the single uPN of glomerulus DL5 in the statistical analysis presented in Figure 3. This comparison was included to ensure a comprehensive representation of the data, highlighting the variability across all major cell groups. We have clarified this rationale in the revised manuscript (see lines 274-282).

      Reviewer #2 (Recommendations for The Authors): 

      I commend the authors for taking such a thorough approach to advance an interesting topic in olfaction. The following suggestions are intended to strengthen this study: 

      Major points: 

      A color-blind-friendly palette should be used for all figures. Currently, five of seven figures use red and green, and in particular, Figure 5 will be uninterpretable for red/green color-blind readers. 

      We are thankful for this important comment. We changed the color palette as suggested by the reviewer, replaced red with magenta, and changed the figure legend accordingly.

      This level of analysis is extremely resource and time-consuming, so even obtaining this information at this resolution is an impressive achievement. However, this study would be well served by strategically supplementing the analysis of this dataset with information from other publicly available connectomics datasets. For instance, some interpretations are limited because there is information from only a single DL5 and DA2 glomerulus. Any claims in which one glomerulus has more, less, or the same of a metric must be tempered because without replicates, there are no measures of inter-animal variability. As an example, on lines 386-387 the authors state "The relative synaptic strength between MGN>uPN was stronger in DA2 (12%) than DL5 (10%)". It is difficult to assess whether this represents a difference that is outside of the range of inter-animal variability inherent to the olfactory system. Taking select measures from the Hemibrain and FAFB (via FlyWire) datasets could help strengthen these claims. 

      We fully agree with the Reviewer’s opinion that since our data is from one glomerulus of each type “It is difficult to assess whether this represents a difference that is outside of the range of inter-animal variability inherent to the olfactory system.” This is a weakness of practically all connectome studies based on electron microscopy in both Drosophila and other animals. We cannot be sure that measurements from the Hemibrain and FAFB datasets could help strengthen our claims, because the magnitude of the range of individual variation is presently not known and solving this problem will most probably require more than one or two more flies. In any case, it is not possible to follow this advice and compare our data with that of the hemibrain because DA2 was not included in that study. We ask the Reviewer to read our more detailed explanation in our response to Reviewer 1.

      In the particular case commented on by the Reviewer above, the relative difference in synaptic strength exceeds 20%. Whether such a difference has functional relevance remains an open question, but Schlegel et al. (2024) support our interpretation. They showed that synaptic weights with differences larger than 20% tend to be consistent across individuals, with strong correlations within and between animals (Pearson’s R = 0.97 and R = 0.8; Fig. 4).
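
      For reference, the relative difference can be checked from the rounded percentages quoted by the Reviewer (MGN>uPN: 12% in DA2, 10% in DL5). The following is only a minimal sketch: the function name `relative_difference` and the choice of the smaller value (DL5) as the baseline are assumptions made for illustration, not a description of how the manuscript values were computed. With these rounded inputs the result is 20%; the unrounded values may give a slightly different figure.

```python
def relative_difference(value: float, baseline: float) -> float:
    """Return (value - baseline) / baseline as a fraction.

    Hypothetical helper for illustration; the baseline choice is an assumption.
    """
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (value - baseline) / baseline


# Rounded relative synaptic strengths (percent of synapses) quoted in the review.
da2_mgn_to_upn = 12.0
dl5_mgn_to_upn = 10.0

rel = relative_difference(da2_mgn_to_upn, dl5_mgn_to_upn)
print(f"Relative difference DA2 vs DL5: {rel:.0%}")
```

      Using the larger value as the baseline instead would give a smaller relative difference, so the baseline should be stated whenever such a threshold is applied.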

      Grouping all local interneurons, centrifugal neurons and multiglomerular PNs into one category limits the ability to make interpretations about similarities or differences in the synaptic relationships involving MGNs. The authors could get an estimate of the number of multiglomerular PNs in DL5, VA1v, and DA2 from Hemibrain and FlyWire platforms to get a better sense of differences between glomeruli in the MGN category. 

      We agree that grouping a variety of interneurons into a single category (called MGNs) limits the ability to make interpretations about similarities or differences in the synaptic relationships involving different neurons. This was the unavoidable price to be paid once we decided to register a “comprehensive, synapse-level resolution” map of these two glomeruli. It appears to us that both reviewers have clearly recognized the intrinsic value of this approach and we hope that the Editor will share this opinion. 

      Consistent with the assumptions of Tobin et al. (2017), our hypothesis on LN connectivity differences is based on the fact that they are the most numerous and broadly arborizing neurons of the class that we call multiglomerular neurons in the AL (Chou et al., 2010; Lin et al., 2012; Tanaka et al., 2012). Recent connectome studies confirm this feature across all glomeruli (Bates et al., 2020; Horne et al., 2018; Scheffer et al., 2020; Schlegel et al., 2021; Zheng et al., 2018).

      In response to the reviewer’s question, we conducted a case-specific reanalysis of the data from Horne et al. (2018), which provides comprehensive connectivity information for the VA1v glomerulus. This allowed us to quantify the proportional contributions of LNs (n = 56) and mPNs (n = 13) to all MGN connections (MGN-MGN, MGN>OSN, MGN>uPN, uPN>MGN, OSN>MGN).

      Our analysis showed that 84% of MGN output originates from LNs. 57% of the input to MGNs comes from LNs and 43% from mPNs, largely due to strong OSN>mPN input. Thus, for the filtered MGN connections relevant to distinguishing narrowly from broadly tuned circuits (e.g., MGN>OSN, uPN>MGN; see Fig. 8), LNs are the dominant contributors in VA1v. (These data are not included in the resubmitted manuscript.) This supports our interpretation that LNs are responsible for the majority of MGN connections underlying the observed differences between glomeruli.

      For instance, prior work has reported fewer local interneurons innervating DA2, but in this study there was an unexpected result that there was greater MGN innervation density and synapse # for DA2 relative to DL5. This discrepancy could be due to differences in the number of multiglomerular PNs innervating each glomerulus, which would be obscured when these PNs are combined with local interneurons in the MGN category. 

      We agree that the greater MGN innervation density in DA2 in our study could reflect a stronger contribution from mPNs. However, innervation density alone does not indicate how many mPNs actually innervate DA2 or DL5. Alternatively, increased innervation and/or synaptic frequency of local interneurons (LNs) could also account for this observation. In our view, neuron number does not necessarily correlate with branching complexity or synaptic density. 

      For example, the dendritic length of the single uPN in glomerulus DL5 is approximately equal to the combined dendritic length of the multiple uPNs of DA2. Similarly, Tobin et al. (2017) reported that when comparing uPNs in glomerulus DM6 between the left and right brain hemispheres, they found variability in cell number but not in dendritic length. More recently, the FAFB and hemibrain datasets showed a similar pattern in another neuronal type. A substantial variation in cell number was observed for Kenyon cells between the two Drosophila individuals, but in both individuals this cell type consistently makes and receives similar numbers of presynapses and postsynapses (Schlegel et al., 2024).

      On line 33 the authors cannot claim that DA2-OSNs experience less presynaptic inhibition based on the data in this study. Even without the limitations of the MGN category (described above), presynaptic inhibition depends on more than just the number of synapses, rather it is affected by GABA B receptor expression levels and the second messenger components downstream of this receptor. Physiological experiments are needed to justify this claim, so I recommend adjusting accordingly.

      We agree with the Reviewer and have adjusted the text on line 33 and in the main body of the text by referring to this finding as “presynaptic input”, which is what we have quantified, instead of “less presynaptic inhibition”.

      Figures 5 and 6 seek to distill the wealth of information from this study into broad take-home points for the reader, while still providing a good amount of detail. I think a final more concise graphic summary (similar to the graphical abstract or Figure 6 of Grabe et al 2016) depicting the most critical differences between glomeruli would further clarify the broad findings of this study. 

      We appreciate this comment and we have added a “graphic summary” as the Reviewer proposed. We made a new figure that becomes Figure 8 and summarizes our results and highlights differences between narrowly and broadly tuned glomeruli in a more concise graphical abstract format.

      Minor points: 

      Much of the manuscript provides details about synapse fractions or % synapses for a given synaptic relationship. Please ensure that it is clear which principal cell types are being described, as it can be easy to get lost.  - Should line 284 say "...than DL5 as it has been reported that DA2 is innervated by fewer LNs..."?

      We appreciate the reviewer’s comment and have corrected this sentence (see text beginning at line 290).

      Taisz et al.  has been published, so the citation should be updated. 

      We have updated the corresponding citation.  

      On line 233, the authors ascribe the small electron-dense vesicles as likely housing sNPF released by MGNs. However, Carlsson et al. (2010) demonstrated that sNPF is released by OSNs, which was further functionally characterized by Root et al. (2011) and Ko et al. (2014). In terms of MGNs that release neuropeptides, Carlsson et al. 2010 demonstrated that local interneurons immunolabel for tachykinin, myoinhibitory peptide, and allatostatin-A, while two extrinsic neurons release SIFamide. In theory, aminergic neurons could also have small electron-dense vesicles, but this can be variable. 

      The Reviewer is completely right in this criticism. The MGNs certainly include neurons that have been reported to contain neuropeptides other than sNPF. We have corrected this sentence and it now reads as follows (page 7, line 236): “Interestingly, besides the abundant clear small vesicles..

      On line 636, the Berck and Schlegel studies demonstrated that panglomerular local interneurons synapse upon OSN, but not that they induce presynaptic inhibition (which was demonstrated in the studies cited in the next sentence). I recommend adjusting this sentence.

      We agree and we have corrected the text following the Reviewer’s advice. It now reads as follows (page 19, line 663): “We also observed that OSNs received less MGN feedback.

    2. eLife Assessment

      This study seeks to determine how synaptic relationships between principal cell types in the olfactory system vary with glomerulus selectivity and is therefore valuable to the field. The methodology is solid, and with the caveat that there was a technical need to group all local interneurons, centrifugal neurons and multiglomerular projection neurons into one category ("multiglomerular neurons"), this work reveals some very interesting potential differences in circuit architecture associated with glomerular tuning breadth.

    3. Reviewer #1 (Public review):

      In this manuscript, Gruber et al. perform serial EM sections of the antennal lobe and reconstruct the neurites innervating two types of glomeruli - one that is narrowly tuned to geosmin and one that is broadly tuned to other odours. They quantify and describe various aspects of the innervations of olfactory sensory neurons (OSNs), uniglomerular projection neurons (uPNs), and the multiglomerular local interneurons (LNs) and PNs (mPNs). They find that narrowly tuned glomeruli had stronger connectivity from OSNs to PNs and LNs, and considerably more connections between sister OSNs and sister PNs than the broadly tuned glomeruli. They also had less connectivity with the contralateral glomeruli. These observations are suggestive of strong feed-forward information flow with minimal presynaptic inhibition in narrowly tuned glomeruli, which might be ecologically relevant, for example, while making quick decisions such as avoiding a geosmin-laden landing site. In contrast, information flow in more broadly tuned glomeruli shows much more lateralisation of connectivity to the contralateral glomerulus, as well as to other ipsilateral glomeruli.

      The data are well presented, the manuscript clearly written, and the results will be useful to the olfaction community. I had earlier suggested comparisons with other EM datasets that exist to investigate stereotypy, and am convinced by their efforts and reasons for which these were either not possible to do or not possible within the timeframe of a revision.

      Comments on revisions:

      Thank you for the careful responses to my suggestions. I hope that such approaches will be possible by others going forward.

    4. Reviewer #2 (Public review):

      The chemoreceptor proteins expressed by olfactory sensory neurons differ in their selectivity such that glomeruli vary in the breadth of volatile chemicals to which they respond. Prior work assessing the relationship between tuning breadth and the demographics of principal neuron types that innervate a glomerulus demonstrated that narrowly tuned glomeruli are innervated by more projection neurons (output neurons) and fewer local interneurons relative to more broadly tuned glomeruli. The present study used high resolution electron microscopy to determine which synaptic relationships between principal cell types also vary with glomerulus tuning breadth using a narrowly tuned glomerulus (DA2) and a broadly tuned glomerulus (DL5). The strength of this study lies in the comprehensive, synapse-level resolution of the approach. Furthermore, the authors implement a very elegant approach of using a 2-photon microscope to score the upper and lower bounds of each glomerulus, thus defining the bounds of their restricted regions of interest. Using the approach, the authors identify several architectural motifs that differ between glomeruli with different tuning properties.

      In the revised version of this study the authors discuss several important limitations. There was a technical need to group all local interneurons, centrifugal neurons and multiglomerular projection neurons into one category ("multiglomerular neurons") which complicates interpretations as even multiglomerular projection neurons are very diverse. With only 2 narrowly tuned glomeruli and 1 broadly tuned glomerulus, architecture differences may reflect more than just differences in tuning breadth. Finally, the degree to which inter-animal variability may contribute to differences between glomeruli is discussed. If these caveats are kept in mind, this work reveals some very interesting potential differences in circuit architecture associated with glomerular tuning breadth.

      This work establishes specific hypotheses about network function within the olfactory system that can be pursued using targeted physiological approaches. It also identifies key traits that can be explored using other high resolution EM datasets and other glomeruli that vary in their tuning selectivity. Finally, the laser "branding" technique used in this study establishes a reduced cost procedure for obtaining smaller EM datasets from targeted volumes of interest by leveraging the ability to transgenically label brain regions in Drosophila.

      Comments on revisions:

      I appreciate the thoughtful responses that the authors made regarding the initial assessment of their study. The authors discuss these limitations in their manuscript which should not be viewed as criticisms, but rather caveats to be considered for this study specifically and in some instances, for all connectomics studies.

      I still believe there is a lost opportunity to make use of the FlyWire dataset to make specific strategic comparisons. I do not propose attempting to replicate the comprehensive nature of the main study, but querying cell type based on glomerular innervation would allow the authors to address consistency of observed differences between glomeruli as ORNs and uPNs have been thoroughly annotated and analysis can be limited by neuropil. I agree that it is unclear how many individuals would need to be examined to achieve sufficient statistical power, but some of the circuit motifs revealed in this study can be readily tested in the FlyWire dataset. For instance, the observation from this study that narrowly tuned ORNs receive less synaptic input from LNs is supported in FlyWire, with DL5 ORNs getting far more synaptic input from LNs relative to DA2 and VA1v. I'm not proposing repeating all of the analyses from this study, and there is no doubt that inter-animal variability and technical differences can explain different observations across datasets, but I believe these are considerations of which the readers (who can query these synaptic relationships in FlyWire) should be made aware.

    1. eLife Assessment

      The manuscript presents a valuable finding that CCDC32, beyond its reported role in AP2 assembly, follows AP2 to the plasma membrane and regulates clathrin-coated pit assembly and dynamics. The authors further identify an alpha-helical region within CCDC32 that is essential for its interaction with AP2 and its cellular function. While live-cell and ultrastructural imaging data are solid, future biochemical studies will be needed to confirm the proposed CCDC32-AP2 interaction.

      [Editors' note: this paper was reviewed by Review Commons.]

    2. Reviewer #1 (Public review):

      Yang et al. describe CCDC32 as a new clathrin-mediated endocytosis (CME) accessory protein. The authors show that CCDC32 binds directly to AP2 via a small alpha-helical region and that cells depleted of this protein show defective CME. Finally, the authors show that the CCDC32 nonsense mutations found in patients with cardio-facial-neuro-developmental syndrome (CFNDS) disrupt the interaction of this protein with the AP2 complex. The results presented suggest that CCDC32 may act as both a chaperone (as recently published) and a structural component of the AP2 complex.

    3. Reviewer #2 (Public review):

      Summary:

      The authors responded to my previous concerns with additional arguments and discussion. While I do not object to the publication of this work, two critical experiments are still missing.

      Weaknesses:

      First, biochemical assays using recombinant proteins should be conducted to determine whether CCDC32 binds to the full AP2 adaptor or to specific AP2 intermediates, such as hemicomplexes. The current co-IP data from mammalian cell lysates are too complex to interpret conclusively. Second, cell fractionation should be performed to assess whether, and how, CCDC32 associates with membrane-bound AP2.

    4. Reviewer #3 (Public review):

      In this manuscript, Yang et al. characterize the endocytic accessory protein CCDC32, which has implications in cardio-facio-neuro-developmental syndrome (CFNDS). The authors clearly demonstrate that the protein CCDC32 has a role in the early stages of endocytosis, mainly through the interaction with the major endocytic adaptor protein AP2, and they identify regions taking part in this recognition. Through live cell fluorescence imaging and electron microscopy of endocytic pits, the authors characterize the lifetimes of endocytic sites, the formation rate of endocytic sites and pits and the invagination depth, in addition to transferrin receptor (TfnR) uptake experiments. Binding of CCDC32 and CCDC32 mutants to the AP2 alpha appendage domain is assessed by pull-down experiments.

      Together, these experiments allow deriving a phenotype of CCDC32 knock-down and CCDC32 mutants within endocytosis, which is a very robust system, in which defects are not so easily detected. A mutation of CCDC32, mimicking CFNDS mutations, is also addressed in this study and shown to have endocytic defects.

      An experimental proof for the resistance of the different CCDC32 mutants to siRNA treatment would have helped to strengthen the conclusions.

      In summary, the authors present a strong combination of techniques, assessing the impact of CCDC32 in clathrin mediated endocytosis and its binding to AP2.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      This is a revision of a manuscript previously submitted to Review Commons. The authors have partially addressed my comments, mainly by expanding the introduction and discussion sections. Sandy Schmid, a leading expert on the AP2 adaptor and CME, has been added as a co-corresponding author. The main message of the manuscript remains unchanged. Through overexpression of fluorescently tagged CCDC32, the authors propose that, in addition to its established role in AP2 assembly, CCDC32 also follows AP2 to the plasma membrane and regulates CCP maturation. The manuscript presents some interesting ideas, but there are still concerns regarding data inconsistencies and gaps in the evidence.

      With due respect, we would argue that a role for CCDC32 in AP2 assembly is hardly ‘established’. Rather, a single publication reporting its role as a co-chaperone for AAGAB appeared while our manuscript was under review. We find some similar and some conflicting results, which are described in our revised manuscript. However, in combination, our two papers clearly show that CCDC32, a previously unrecognized endocytic accessory protein, deserves further study.

      (1) eGFP-CCDC32 was expressed at 5-10 times higher levels than endogenous CCDC32. This high expression can artificially drive CCDC32 to the cell surface via binding to the alpha appendage domain (AD), an interaction that may not occur under physiological conditions.

      While we acknowledge that overexpression of eGFP-CCDC32 could result in artificially driving it to CCPs, we do not believe this is the case for the following reasons:

      i. The bulk of our studies (Figures 2-4) demonstrate the effects of siRNA knockdown of CCDC32 on CCPs at early stages of CME, and so it is likely that these functions require the presence of endogenous CCDC32 at nascent CCPs, as detected with overexpressed eGFP-CCDC32 by TIRF imaging.

      ii. At these levels of overexpression, eGFP-CCDC32 fully rescues the effects of siRNA KD of endogenous CCDC32 on Tfn uptake and CCP dynamics (Figure 6F,G). If the protein were artificially recruited to the AP2 appendage domain, one would expect it to compete with the recruitment of other EAPs to CCPs and hence cause defects in CCP dynamics. Indeed, we see the opposite: CCPs that are positive for eGFP-CCDC32 show normal dynamics and maturation rates, while CCPs lacking eGFP-CCDC32 are short-lived and more likely to be aborted (Figure 1C).

      iii. We have identified two modes of binding of CCDC32 to AP2 adaptors: one is through canonical AP2-AD binding motifs, the second is through an α-helix in CCDC32 that, by modeling, docks only to the open conformation of AP2. Overexpressed CCDC32 lacking this α-helix is not recruited to CCPs (Figure 6D,E), indicating that the canonical AP2 binding motifs are not sufficient to recruit CCDC32 to CCPs, even when overexpressed.

      (2) Which region of CCDC32 mediates alpha AD binding? Strangely, the only mutant tested in this work, Δ78-98, still binds AP2, but shifts to binding only mu and beta. If the authors claim that CCDC32 is recruited to mature AP2 via the alpha AD, then a mutant deficient in alpha AD binding should not bind AP2 at all. Such a mutant is critical for establishing the model proposed in this work.

      We understand the reviewer’s confusion and thus devoted a paragraph in the discussion to this issue. As revealed by AlphaFold 3.0 modeling (Figure S6), binding of CCDC32 to the alpha AD likely occurs via the 2 canonical AP2-AD binding motifs encoded in CCDC32. Given the highly divergent nature of AP2-AD binding motifs, we identified these motifs only through the AlphaFold 3.0 modeling. While these interactions could be detected by GST pull-downs, they are apparently not of sufficient affinity to recruit CCDC32 to CCPs in cells. In the text, we now describe the α-helix we identified as being essential for CCP recruitment as ‘a’ AP2 binding site on CCDC32 rather than ‘the’ AP2 binding site. Interestingly, and also discussed, AlphaFold 3.0 identifies a high-confidence docking site on α-adaptin that is only accessible in the open, cargo-bound conformation of intact AP2. This is also consistent with the inability of CCDC32(Δ78-99) to bind the α:µ2 hemi-complex in cell lysates.

      We agree that further structural studies on CCDC32’s interactions with AP2 and its targeting to CCPs will be of interest for future work.

      (3) The concept of hemicomplexes is introduced abruptly. What is the evidence that such hemicomplexes exist? If CCDC32 binds to hemicomplexes, this must occur in the cytosol, as only mature AP2 tetramers are recruited to the plasma membrane. The authors state that CCDC32 binds the AD of alpha but not beta, so how can the Δ78-98 mutant bind mu and beta?

      We introduced the concept of hemicomplexes based on our unexpected (and now explicitly stated as such) finding that the CCDC32(Δ78-99) mutant efficiently co-IPs with a β2:µ2 hemicomplex. As stated, the efficiency of this pulldown suggests that the presumed stable AP2 heterotetramer must indeed exist in equilibrium with the two α:σ2 and β2:µ2 hemicomplexes, such that CCDC32(Δ78-99) can sequester and efficiently co-IP with the β2:µ2 hemicomplex. A previous study, now cited, had shown that the β2:µ2 hemicomplex could partially rescue null mutations of α in C. elegans (PMID: 23482940). We do not know how CCDC32 binds to the β2:µ2 hemicomplex and we did not detect these interactions using AlphaFold 3.0. However, these interactions could be indirect and involve the AAGAB chaperone. It is also likely, based on the results of Wan et al. (PMID: 39145939), that the binding is through the µ2 subunit rather than β2. As mentioned above, and in our Discussion, further studies are needed to define the complex and multi-faceted nature of CCDC32-AP2 interactions.

      (4) The reported ability of CCDC32 to pull down AP2 beta is puzzling. Beta is not found in the CCDC32 interactome in two independent studies using 293 and HCT116 cells (BioPlex). In addition, clathrin is also absent in the interactome of CCDC32, which is difficult to reconcile with a proposed role in CCPs. Can the authors detect CCDC32 binding to clathrin?

      Based on the studies of Wan et al. (PMID: 39145939), it is likely that CCDC32 binds to µ2, rather than to the β2 in the β2:µ2 hemicomplex. As to clathrin being absent from the CCDC32 pull-down, this is as expected since the interactions of clathrin even with AP2 are weak in solution (as shown in Figure 5C, clathrin is not detected in our AP2 pull-down) so as not to have spontaneous assembly of clathrin coats in the cytosol. Rather, these interactions are strengthened by both the reduction in dimensionality that occurs on the membrane and by avidity of multivalent interactions. For example, Kirchhausen reported that 2 AP2 complexes are required to recruit one clathrin triskelion to the PM.

      (5) Figure 5B appears unusual; is this a chimera?

      Figure 5B shows an internal insertion of the eGFP tag into an unstructured region in the AP2 hinge. As we have previously shown (PMID: 32657003), this construct, unique among other commonly used AP2 tags, is fully functional.  We have rearranged the text in the Figure legend to make this clearer.

      Figure 5C likely reflects a mixture of immature and mature AP2 adaptor complexes.

      This is possible, but mature heterotetramers are by far the dominant species; otherwise the 4 subunits would not be immunoprecipitated at near-stoichiometric levels with the α subunit. Near-stoichiometric IP with antibodies to the α-AD has been shown by many others in many cell types.

      (6) CCDC32 is reduced by about half in siRNA knockdown. Why not use CRISPR to completely eliminate CCDC32 expression?

      Fortuitously, partial knockdown was essential to reveal this second function of CCDC32, as we have emphasized in our Discussion. Wan et al. used CRISPR to knock out CCDC32 and revealed its essential role as an AAGAB co-chaperone. In the complete absence of CCDC32, mature AP2 complexes fail to form. However, under our conditions of partial CCDC32 depletion, the expression of AP2 heterotetramers is unaffected, revealing a second function of CCDC32 at early stages of CME. We expect that the co-chaperone function of CCDC32 is catalytic, while its role in CME is more structural; hence the different concentration dependencies, the former being less sensitive to KD than the latter. This is one reason that many researchers are turning to CRISPRi for whole genome perturbation studies, as many proteins play multiple roles that can be masked in KO studies.

      Reviewer #2 (Public review):

      Yang et al. describe CCDC32 as a new clathrin mediated endocytosis (CME) accessory protein. The authors show that CCDC32 binds directly to AP2 via a small alpha helical region and cells depleted for this protein show defective CME. Finally, the authors show that the CCDC32 nonsense mutations found in patients with cardio-facial-neuro-developmental syndrome (CFNDS) disrupt the interaction of this protein with the AP2 complex. The results presented suggest that CCDC32 may act as both a chaperone (as recently published) and a structural component of the AP2 complex.

      Strengths:

      The conclusions presented are generally well supported by experimental data and the authors carefully point out the differences between their results and the results by Wan et al. (PNAS 2024).

      Weaknesses:

      The experiments regarding the role of CCDC32 in CFNDS still require some clarifications to make them clearer to scientists working on this disease. The authors fail to describe that the CCDC32 isoform they use in their studies is different from the one used when CFNDS patient mutations were described. This may create some confusion. Also, the authors did not discuss that the frame-shift mutations in patients may be leading to nonsense mediated decay.

      As requested, we have more clearly described our construct with regard to the human mutations and added the possibility of NMD in the context of the human mutations.

      Reviewer #3 (Public review):

      In this manuscript, Yang et al. characterize the endocytic accessory protein CCDC32, which has implications in cardio-facio-neuro-developmental syndrome (CFNDS). The authors clearly demonstrate that the protein CCDC32 has a role in the early stages of endocytosis, mainly through the interaction with the major endocytic adaptor protein AP2, and they identify regions taking part in this recognition. Through live cell fluorescence imaging and electron microscopy of endocytic pits, the authors characterize the lifetimes of endocytic sites, the formation rate of endocytic sites and pits and the invagination depth, in addition to transferrin receptor (TfnR) uptake experiments. Binding of CCDC32 and CCDC32 mutants to the AP2 alpha appendage domain is assessed by pull-down experiments. While interaction between CCDC32 and the alpha appendage domain of AP2 is clearly described, a discussion of potential association with other AP2 domains would be beneficial to understand the impact of CCDC32 in endocytosis.

      The reviewer is correct. That CCDC32 also interacts with other subunits of AP2 is evident from the findings of Wan et al. and from the fact that the CCDC32(Δ78-99) mutant efficiently co-IPs with the β2:µ2 hemicomplex. We expanded our discussion around this point. CCDC32 remains an as yet poorly characterized but, we now believe, very interesting EAP worth further study.

      Together, these experiments allow deriving a phenotype of CCDC32 knock-down and CCDC32 mutants within endocytosis, which is a very robust system, in which defects are not so easily detected. A mutation of CCDC32, mimicking CFNDS mutations, is also addressed in this study and shown to have endocytic defects.

      In summary, the authors present a strong combination of techniques, assessing the impact of CCDC32 in clathrin mediated endocytosis and its binding to AP2.

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      (1) The authors must be clear about the differences between the CCDC32 isoform they used in their manuscript and the one used to describe the patient mutations. This could be done, for example, in the methods. This is essential for the capacity of other labs to reproduce, follow up and correctly cite these results.

      We have added this information to the Methods. 

      (2) I believe the authors have misunderstood what nonsense mediated decay is. NMD occurs at the mRNA level and requires a full genome context to occur (introns and exons). The fact that a mutant protein is expressed normally from a construct by no means proves that it does not happen. I believe that adding the possibility of NMD occurring would enrich the discussion.

      Thank you, we have now done more homework and have added this possibility into our discussion of the mutant phenotype. However, if a robust NMD mechanism resulted in a complete loss of CCDC32 protein, then the essential co-chaperone function reported by Wan et al. would result in complete loss of AP2. A more detailed characterization of the cellular phenotype of these mutations, including assessing the expression levels of AP2, would be informative.

      Reviewer #3 (Recommendations for the authors):

      - It is not clear what the authors mean by '~30s lifetime cohort' (line 159). They refer to Figure 2H, which shows the % of CCPs. Can the authors explain exactly what kind of tracks they used for this analysis, for example which lifetime variations were accepted? Do they refer to the cohorts in Figure S4? In Figure S4, the most frequent tracks have lifetimes < 20 s (in contrast to what is stated in the main text). Why was this cohort not used?

      The ‘30s cohort’ refers to CCPs with lifetimes between 25 and 35 s, which encompasses the most abundant species in control cells and CCDC32 KD cells, as shown by the probability curves in Figure 2H. Given the large number of CCPs analyzed, we still have large numbers for our analyses: n=5998 and 4418 for control and siRNA-treated conditions, respectively. Figure 2H shows that the lifetime distribution of CCPs in cells treated with CCDC32 siRNA is shifted to shorter lifetimes. We have clarified this in the text.
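
      The cohort definition above can be sketched as a simple filter on track lifetimes. This is only an illustration and not the authors' analysis pipeline: the function name, the inclusive treatment of the 25 s and 35 s bounds, and the example lifetimes are all assumptions made for the sketch.

```python
def lifetime_cohort(lifetimes_s, center_s=30.0, half_width_s=5.0):
    """Return the lifetimes (in seconds) that fall inside the cohort window.

    The window is [center - half_width, center + half_width]; treating both
    bounds as inclusive is an assumption of this sketch.
    """
    lower = center_s - half_width_s
    upper = center_s + half_width_s
    return [t for t in lifetimes_s if lower <= t <= upper]


# Hypothetical track lifetimes in seconds, for illustration only.
example_lifetimes = [12.0, 24.9, 25.0, 30.0, 35.0, 35.1, 60.0]
cohort = lifetime_cohort(example_lifetimes)
print(len(cohort), "of", len(example_lifetimes), "tracks fall in the 30 s cohort")
```

      With a window this narrow, the number of tracks retained depends strongly on the total number of tracks analyzed, which is why the response reports the cohort sizes for each condition.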

      - Figure S1: It is now clear why the mutant versions of CCDC32 are not detected in this western blot. However, data that show the resistance of these proteins to siCCDC32 are still missing (S1 A is in the absence of siCCDC32, I assume, as the legend suggests). A western blot using an anti-GFP antibody, such as the one used in Figure S1, after siRNA knockdown would provide clarity.

      That these constructs all contain the same mutation in the siRNA target sequence gives us confidence that they are indeed resistant to siRNA.

      - 'Note that the anti-CCDC32 antibody does not detect the eGFP-CCDC32(∆78-98) as well as full-length and is unable to detect eGFP-CCDC32(1-54)'. This phrase should belong to Figure S1 (B), not (A).

      Corrected.

      - The immunoprecipitations of CCDC32 and its mutants with AP2 and its subunits are partially confusing. In Figure 5, the authors show that CCDC32 interacts specifically with the alpha-AD, but not with the beta-AD of AP2. In Figure 6B and C, on the other hand, Co-IPs are shown also with the beta and the mu domain of AP2. This is understandable in the context of the full AP2. However, when interaction with the alpha domain (and sigma) is abolished through mutation of helix 78-98, why would beta and mu still interact, when the beta-AD cannot interact with CCDC32 on its own. Are there interaction sites expected outside the ADs in the beta or mu domains?

      See responses to reviewer 1 above. This result likely reflects the co-chaperone activity of CCDC32 reported by Wan et al. and is likely due to their reported interaction of CCDC32 with the µ2 subunit of β2:µ2 hemicomplexes.

      - Figure S6 D, E and F: How much confidence do the authors have on the AlphaFold predictions? Have the same binding poses been obtained repeatedly by independent predictions?

      We provide, with a color scale, the confidence score for each interaction, which is very high (>90%). Of course, this is still a prediction that will need to be verified by further structural studies as we have stated.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      Cook et al. have presented an important study on the transcriptomic and epigenomic signature underlying craniofacial development in marsupials. Given the lack of a dunnart genome, the authors also prepared long and short-read sequence datasets to assemble and annotate a novel genome to allow for the mapping of RNAseq and ChIPseq data against H3K4me3 and H3K27ac, which allowed for the identification of putative promoter and enhancer sites in dunnart. They found that genes proximal to these regulatory loci were enriched for functions related to bone, skin, muscle and embryonic development, highlighting the precocious state of newborn dunnart facial tissue. When compared with mouse, the authors found a much higher proportion of promoter regions aligned between species than for enhancer regions, and subsequent profiling identified regulatory elements that are conserved across species and important for mammalian craniofacial development. In contrast, the identification of dunnart-specific enhancers and patterns of RNA expression further confirm the precocious state of muscle development, as well as of sensory system development, in dunnart, suggesting that early formation of these features is critical for neonate marsupials, likely to assist with detecting and responding to cues that direct the joeys to the mother's teat after birth. This is one of the few epigenomic studies performed in marsupials (of any organ) and the first performed in fat-tailed dunnart (also of any organ). Marsupials are emerging as an important model for studying mammalian development and evolution and the authors have performed a novel and thorough analysis, impressively including the assembly of a new marsupial reference genome that will benefit many future studies.

      Strengths:

      The study provides multiple pieces of evidence supporting the important role enhancer elements play in mammalian phenotypic evolution, namely the finding of a lower proportion of peaks present in both dunnart and mouse for enhancers than for promoters, and dunnart showing more genes uniquely associated with its active enhancers than any other combination of mouse and dunnart samples, whereas this pattern was less pronounced for promoter-associated genes. In addition, rigorous parameters were used for the cross-species analyses to identify the conserved regulatory elements and the dunnart-specific enhancers. For example, for the results presented in Figure 1, I agree that it is a little surprising that the average promoter-TSS distance is greater than that for enhancers, but that this could be related to the possible presence of unannotated transcripts between genes. The authors addressed this well by examining the distribution of promoter-TSS distances and using proximal promoters (cluster #1) as high confidence promoters for downstream analyses.

      The genome assembly method was thorough, using two different long read methods (Pacbio and ONT) to generate the long reads for contig and scaffold construction, increasing the quality of the final assembled genome.

      Weaknesses:

      Biological replicates of facial tissue were collected at a single developmental time point of the fat-tailed dunnart within the first postnatal day (P0), and analysed in the context of similar mouse facial samples from the ENCODE consortium at six developmental time points, where previous work from the authors has shown that the younger mouse samples (E11.5-12.5) approximately correspond to the dunnart developmental stage (Cook et al. 2021). However, it would be useful to have samples from at least one older dunnart time point, for example, at a developmental stage equivalent to mouse E15.5. This would provide additional insight into the extent of accelerated face development in dunnart relative to mouse, i.e. how long do the regulatory elements that are activated early in dunnart remain active for and does their function later influence other aspects of craniofacial development?

      We thank the reviewer for their feedback and agree that the inclusion of multiple postnatal stages in the dunnart would give further valuable insights to the comparative analyses. Unfortunately, we were limited by the pouch young available and prioritized ensuring robust data at a single stage for this study. We hope to expand this work to more stages in future studies.

      The authors refer to the development of the CNS being delayed in marsupials relative to placental mammals, however, evidence shows how development of the dunnart brain (whole brain or cortex) is protracted compared to mouse, by a factor of at least 2 times, rather than delayed per se (Workman et al. 2013; Paolino et al. 2023). In addition, there is evidence that cortical formation and cell birth may begin at approximately the same stage across species equivalent to the neonate period in dunnart (E10.5 in mouse), and that shortly after this at the stage equivalent to mouse E12.5, the dunnart cortex shows signs of advanced neurogenesis followed by a protracted phase of neuronal maturation (Paolino et al. 2023). Therefore, it is possible that marsupial CNS development appears delayed relative to mouse but instead begins at the same stage and then proceeds to develop on a different timing scale.

      The comparison here is not directly between CNS development in placental and marsupials but CNS development relative to development of a subset of structures of the cranial skeleton and musculature (as first proposed by Kathleen Smith 1997). For example, Smith 1997 found that in eutherians, evagination of the telencephalon and appearance of the pigment in the eye occur before the ossification of the premaxilla, maxilla, and dentary. However, in marsupials, evagination of the telencephalon and appearance of the pigment in the eye occur concurrently with condensation of cartilage in the basicranium and the ossification of the premaxilla, maxilla, and dentary. Smith 1997 reports both a delay in the initiation of CNS development in marsupials relative to craniofacial ossification and a protraction of CNS development compared to placental mammals.

      This also highlights the challenges of correlating different staging systems between placentals and marsupials as stages determined as equivalent can change depending on which developmental events are used. The protracted development of the CNS in marsupials (Smith 1997, Workman et al. 2013; Paolino et al. 2023) still supports the hypothesis that during the short gestation period in marsupials structures required for life outside the womb in an embryonic-like state, such as the orofacial region, are likely prioritized.

      We have clarified this based on the reviewer's feedback and added text referring to the protraction of marsupial CNS development to the Discussion section.

      [New text]: Marsupials display advanced development of the orofacial region relative to development of the central nervous system when compared to placental mammals[3,6].

      [New text]: Although development of the central nervous system is protracted in marsupials compared to placentals, marsupials have well-developed peripheral motor nerves and sensory nerves (e.g. the trigeminal) at birth [5].

      Reviewer #2 (Public review):

      This study by Cook and colleagues utilizes genomic techniques to examine gene regulation in the craniofacial region of the fat-tailed dunnart at perinatal stages. Their goal is to understand how accelerated craniofacial development is achieved in marsupials compared to placental mammals.

      The authors employ state-of-the-art genomic techniques, including ChIP-seq, transcriptomics, and high-quality genome assembly, to explore how accelerated craniofacial development is achieved in marsupials compared to placental mammals. This work addresses an important biological question and contributes a valuable dataset to the field of comparative developmental biology. The study represents a commendable effort to expand our understanding of marsupial development, a group often underrepresented in genomic studies.

      The dunnart's unique biology, characterized by a short gestation and rapid craniofacial development, provides a powerful model for examining developmental timing and gene regulation. The authors successfully identified putative regulatory elements in dunnart facial tissue and linked them to genes involved in key developmental processes such as muscle, skin, bone, and blood formation. Comparative analyses between dunnart and mouse chromatin landscapes suggest intriguing differences in deployment of regulatory elements and gene expression patterns.

      Strengths

      (1) The authors employ a broad range of cutting-edge genomic tools to tackle a challenging model organism. The data generated - particularly ChIP-seq and RNA-seq from craniofacial tissue - are a valuable resource for the community, which can be employed for comparative studies. The use of multiple histone marks in the ChIP-seq experiments also adds to the utility of the datasets.

      (2) Marsupials occupy an important phylogenetic position, but they remain an understudied group. By focusing on the dunnart, this study addresses a significant gap in our understanding of mammalian development and evolution. Obtaining enough biological specimens for these experiments was likely a big challenge that the authors were able to overcome.

      (3) The comparison of enhancer landscapes and transcriptomes between dunnarts and mice can serve as the basis of subsequent studies that will examine the mechanisms of developmental timing shifts. The authors also carried out liftover analyses to identify orthologous enhancers and promoters in mice and dunnart.

      Weaknesses and Recommendations

      (1) The absence of genome browser tracks for ChIP-seq data makes it difficult to assess the quality of the datasets, including peak resolution and signal-to-noise ratios. Including browser tracks would significantly strengthen the paper by providing further support for adequate data quality.

      We have put together an IGV session with the dunnart genome, annotation and ChIP-seq tracks. This is now available in the FigShare data repository (10.7554/eLife.103592.1).

      (2) The first two figures of the paper rely heavily on gene orthology analysis, motif enrichment, etc., to describe the genomic data generated from the dunnart. The main point of these figures is to demonstrate that the authors are capturing the epigenetic signature of the craniofacial region, but this is not clearly supported in the results. The manuscript should directly state what these analyses aim to accomplish and provide statistical tests that strengthen confidence in the quality of the datasets.

      As this is the first epigenomic profiling for this species we performed extensive data quality control (See Supplementary Tables 2-3, 18, 20-23 and Supplementary Figures 1-3, 6-11). These figures and corresponding Supplementary Tables show the robustness of the data, including well-described metrics for assessing promoters and enhancers, GO terms relevant to craniofacial development and binding motifs for key developmental TF families.

      We have emphasised this aspect of the work more strongly in the results section, particularly in [Defining craniofacial putative enhancer- and promoter regions in the dunnart].

      (3) The observation that "promoters are located on average 106 kb from the nearest TSS" raises significant concerns about the quality of the ChIP-seq data and/or genome annotation. The results and supplemental information suggest a combination of factors, including unannotated transcripts and enhancer-associated H3K4me3 peaks, but this issue is not fully resolved in the manuscript. The authors should confirm that this is not caused by spurious peaks in the ChIP-seq analysis and possibly improve genome annotation with the transcriptomic datasets presented in the study.

      Spurious ChIP-seq peaks are possible, as there is no “blacklisted regions” database for the dunnart to filter on. However, when calling peaks we used a no-IP control and a stringent FDR of 0.01, and peaks had to be reproducible in two biological replicates, all of which should reduce the likelihood of false positives.

      H3K4me3 activity at enhancers is well-established, in particular when enhancer sequences are also bound by RNA Pol II (Koch and Andrau, 2011; Pekowska et al., 2011). However, compared to H3K4me3 activity at promoters, H3K4me3 levels at enhancers are low (Calo and Wysocka, 2013). This is in line with our observations that H3K4me3 levels at enhancers are much lower than observed at promoter regions (see Supplementary Note 2). We found that H3K4me3 peaks located closer to the TSS had a stronger peak signal (mean = 46.10) than distal H3K4me3 peaks (mean = 6.95; Wilcoxon FDR-adjusted p < 2.2 x 10<sup>-16</sup>). This suggests that although some distal promoter peaks may be due to gaps in the annotation, the majority likely represent peaks associated with enhancer regions. We have emphasized this finding more strongly in the results section:

      [New text]: H3K4me3 activity at enhancers is well-established[25,26], however, compared to H3K4me3 activity at promoters, H3K4me3 levels at enhancers are low[27]. This is in line with our observations where H3K4me3 levels at distal enhancer peaks are nearly 7 times lower than those observed at promoter regions (see SupNote2).

      (4) The comparison of gene regulation between a single dunnart stage (P1) and multiple mouse stages lacks proper benchmarking. Morphological and gene expression comparisons should be integrated to identify equivalent developmental stages. This "alignment" is essential for interpreting observed differences as true heterochrony rather than intrinsic regulatory differences.

      Given the developmental differences between eutherian and marsupial mammals, it is challenging to assign the dunnart a precise “equivalent” developmental stage in the mouse. From our morphological and developmental characterisation (see Cook et al. 2020 Nat Comms Bio), based on ossification patterns, the dunnart orofacial region on the day of birth appears to be similar to that of an E12.5 mouse embryo (just prior to the observation of ossified craniofacial bones). However, when we compared both regulatory elements and expressed genes between the dunnart at this stage (P1) and 5 developmental stages in the mouse, there was no obvious equivalent stage. For example, when we simply compare genes linked to enhancer peaks, the group with the largest intersection between dunnart and any mouse stage is the ~500 genes present in dunnart and in mouse stages E10.5 and E12.5-E15.5 (Figure 5B). When we then compare genes expressed in the dunnart to temporal gene expression dynamics during mouse development, we find that the largest overlap is with genes highly expressed at E14.5 or E15.5 in the mouse (Figure 6, Supplementary Figure 5). We have strengthened the rationale for the selected mouse stages in the comparative analyses section of the results.

      (5) The low conservation of putative enhancers between mouse and dunnart (0.74-6.77%) is surprising given previous reports of higher tissue-specific enhancer conservation across mammals. The authors should address whether this low conservation reflects genuine biological divergence or methodological artifacts (e.g., peak-calling parameters or genome quality). Comparisons with published studies could contextualize these findings.

      The reported range (0.74 - 6.77%) refers to the number of regions called as an active enhancer peak in both species (conserved activity) divided by the total number of dunnart peaks alignable to the mouse genome, which we expect to be low given sequence turnover rates and the evolutionary distance separating dunnart and mice. The alignability (conserved sequence) for dunnart enhancers to the mouse genome was ~13% for 100bp regions and can be found in Supplementary Table 22. We have now clarified this in the main text.

      [New Text]: After building dunnart-mm10 liftover chains (see Methods and SupNote5) we compared mouse and dunnart regulatory elements. The alignability (conserved sequence) for dunnart enhancers to the mouse genome was ~13% for 100bp regions (Supplementary Table 22).

      The activity conservation range reported here is consistent with that previously reported for marsupial-placental enhancer comparisons (Villar et al. 2015), where ~1% of conserved liver-specific human enhancers had conserved activity in opossum. Follow-up studies in Berthelot et al. 2018 also found that approximately 1% of human liver enhancers were conserved across the placental mammals included in the study.
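      The two ratios distinguished in this response can be sketched as follows. This is only an illustrative sketch, not the authors' pipeline: the function names and all peak counts are hypothetical, chosen so that the alignability comes out near the ~13% quoted above.

```python
def alignability(n_alignable: int, n_total: int) -> float:
    """Conserved sequence: share of dunnart peaks that lift over to the mouse genome."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_alignable / n_total


def activity_conservation(n_active_in_both: int, n_alignable: int) -> float:
    """Conserved activity: share of alignable dunnart peaks also called active in mouse."""
    if n_alignable <= 0:
        raise ValueError("n_alignable must be positive")
    return n_active_in_both / n_alignable


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not taken from the study).
    total_dunnart_peaks = 10_000
    alignable_peaks = 1_300
    active_in_both = 40

    print(f"alignability: {alignability(alignable_peaks, total_dunnart_peaks):.1%}")
    print(f"activity conservation: {activity_conservation(active_in_both, alignable_peaks):.2%}")
```

      The point of separating the two functions is that the denominators differ: alignability is measured against all dunnart peaks, whereas activity conservation is measured only against the subset that could be aligned.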

      (6) Focusing only on genes associated with shared enhancers excludes potentially relevant genes without clear regulatory conservation. A broader analysis incorporating all orthologous genes may reveal additional insights into craniofacial heterochrony.

      We appreciate the reviewer's comment. We understand that a broader analysis may provide some additional insights into this question; however, in this study our focus was understanding the enhancers driving craniofacial development in these species. We linked enhancers with gene expression data as additional evidence of regulatory programs involved in craniofacial development. The majority (~70%) of genes reproducibly expressed were linked to an active enhancer and/or promoter. This has now been highlighted in the results section.

      [New Text]: There were 12,153 genes reproducibly expressed at a level > 1 TPM across three biological replicates, with the majority of expressed genes (67%; 8158/12153) near an active enhancer and/or promoter peak.

      In conclusion, this study provides an important dataset for understanding marsupial craniofacial development and highlights the potential of genomic approaches in non-traditional model organisms. However, methodological limitations, including incomplete genome annotation and lack of developmental benchmarking, weaken the robustness of the findings. Addressing these issues would significantly enhance the study's utility to the field and its ability to support the study's central conclusion that dunnart-specific enhancers drive accelerated craniofacial development.

      Reviewer #1 (Recommendations for the authors):

      Minor comments and corrections:

      (1) ChIP-seq FRiP fractions were much higher in dunnart samples than in mouse. Is this related to any differences in sample preparation they are aware of in the ENCODE datasets of mouse, such as different anti-histone antibodies used (and therefore different efficiency of binding to the same histone markers across species)? The authors appear to have addressed something similar with respect to the much lower enriched peak number observed in the mouse sample relative to dunnart in Supp note 4. I suspect the "technical cofounder" they refer to there is affecting both the FRiP scores and the higher correlation coefficients between IP and input in mouse.

      We chose the same antibodies used in the mouse craniofacial tissue ENCODE experiments; however, the procedure is slightly different. We used the MAGnify Chromatin Immunoprecipitation System, while the ENCODE assays performed by Bing Ren’s group in 2012 used an in-house lab protocol for MicroChIP. Given that the samples for mouse and dunnart were not processed together, by the same researcher, with the same protocol, there could be any number of technical confounders impacting enrichment. A low FRiP score suggests low specificity, as the majority of reads are in non-specific regions (low enrichment), consistent with the higher correlation between IP and input in mouse. The data quality also appears to vary between H3K27ac and H3K4me3 in the mouse (Supplementary Table 21), with H3K4me3 FRiP scores more similar to those observed in our dunnart experiments. This suggests a potential confounder specific to the mouse H3K27ac IP. QC metrics (FRiP, bam correlation) are consistent between H3K27ac and H3K4me3 IPs in our experiments (Supplementary Table 20).

      (2) Some of the promoter peak numbers in Supp table 1 do not match the numbers in the main text.

      We have corrected the incorrect number reported in the text for promoter peaks with orthologous genes (8590 -> 8597).

      (3) In Supp tables 2 and 3, the number of GO terms similar across tables is 466, which is ~42% of total number of enriched GO terms. However the authors mention that only 23% of terms were the same between promoters and enhancers, and a value of 42% was applied to the proportion of terms uniquely enriched for terms associated with genes assigned to promoters only. Unless I'm reading these Supp tables incorrectly, is it possible the proportions were mixed up?

      Thanks for catching this. The lists provided in Supplementary Table 2 were incorrect. The Supplementary Tables and in-text description have been corrected to reflect this.

      (4) Would be helpful to add a legend for the mouse samples in Supp Figure 10.

      We have added the labels to the plot.

      (5) In Supp note 5, regarding the percentage of alignable peaks recovered, the percentages mentioned for the 50bp and 500bp peak summit lengths for enhancers and promoters do not seem to match the values in Supp tables 22 and 23.

      Thank you for catching this. We have corrected the values in the Supplementary Tables and in the text.

      (6) Please provide additional information to explain how dunnart RNA expression was associated with the five temporal expression clusters found in the mouse data shown in Figure 6, given there is only one dunnart time point and so the species' temporal patterns could not be compared, i.e. how was the odds ratio calculated, and was this applied iteratively for dunnart against each mouse age and within each temporal cluster?

      The TCseq package takes the mouse expression data across all 6 stages and calls differentially expressed genes with an absolute log2 fold-change > 2 compared to the starting time-point (E10.5). The mouse gene expression patterns were clustered into 5 clusters that each show distinct temporal expression patterns (see Supplementary Figure 5D). The output from this is 5 lists, each containing unique genes that share a temporal pattern. These lists of mouse genes were then each compared to the orthologous genes expressed in the dunnart using Fisher's Exact Test, with corrections for multiple testing using the Holm method. We have added additional details in the methods:

      [New text]: Orthologous genes reproducibly expressed >1 TPM in the dunnart were compared to the list of genes for each cluster using Fisher’s Exact Test followed by p-value corrections for multiple testing with the Holm method.
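      The per-cluster comparison described above (Fisher's Exact Test followed by Holm correction) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the 2x2 table layout, the choice of a two-sided test, the function names, and every count are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int):
    """Two-sided Fisher's exact test on the 2x2 table [[a, b], [c, d]].

    Returns (odds_ratio, p_value). The p-value sums the hypergeometric
    probabilities of all tables no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def pmf(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = pmf(a)
    # Small relative tolerance so ties with the observed table are included.
    p_value = sum(p for p in (pmf(x) for x in range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, min(1.0, p_value)


def holm_adjust(pvalues):
    """Holm step-down adjustment; returns adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, min(1.0, (m - rank) * pvalues[i]))
        adjusted[i] = running_max
    return adjusted


if __name__ == "__main__":
    # Hypothetical counts per mouse temporal cluster:
    # (in cluster & expressed in dunnart, in cluster & not expressed,
    #  not in cluster & expressed,        not in cluster & not expressed)
    tables = {
        "cluster1": (30, 20, 400, 550),
        "cluster2": (12, 38, 418, 532),
        "cluster3": (25, 25, 405, 545),
        "cluster4": (40, 10, 390, 560),
        "cluster5": (22, 28, 408, 542),
    }
    results = {name: fisher_exact_two_sided(*t) for name, t in tables.items()}
    adjusted = holm_adjust([p for _, p in results.values()])
    for (name, (odds, p)), p_adj in zip(results.items(), adjusted):
        print(f"{name}: odds ratio={odds:.2f}, p={p:.3g}, Holm-adjusted p={p_adj:.3g}")
```

      Each cluster gets its own 2x2 table and its own test, and the Holm step is applied once across the five resulting p-values, which matches the "one list per cluster" structure described in the response.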

      (7) SupFile1 and SupFile2 - which supplementary note or figure are these referring to?

      Apologies for this error. These items were meant to link to the FigShare repository where the supplementary files can be found. We have corrected this using the DOI for the repository.

      Reviewer #2 (Recommendations for the authors):

      (1) Authors should clarify that the mouse ENCODE data used for the comparisons was obtained from craniofacial tissue.

      This has now been corrected to clarify that the mouse ENCODE data used was from craniofacial tissues. The accession numbers and details of the ENCODE mouse embryonic facial prominence ChIP-seq and gene expression quantification files used in the study can be found in Supplementary Table 17.

      (2) Given the large differences in TPM for highly expressed genes shown in Figure 5, a MA or volcano plot would provide a more comprehensive view of global transcriptome differences between species.

      We have added this plot as Supplementary Figure 13.

      (3) It is unclear whether the enrichment analysis was performed for mouse genes, dunnart genes, or both.

      In reference to Figure 5, Gene Ontology enrichment analysis was performed on the top 500 highly expressed genes in dunnart. Because there is not an ontology database for dunnart gene IDs, these top 500 dunnart gene IDs were converted to the orthologous gene ID in mouse before performing the enrichment analysis. We apologise for the lack of clarity and have added additional text in the results section to make this clearer. In addition, the relevant methods section now reads:

      [New text]: As there is no equivalent gene ontology database for dunnart, we converted the Tasmanian devil RefSeq IDs to Ensembl v103 using biomaRt v2.46.3 and then converted these to mouse Ensembl v103 IDs. In this way we were able to use the mouse Ensembl Gene Ontology annotations for the dunnart gene domains. All gene ontology analyses were performed using clusterProfiler v4.1.4[117], with Gene Ontology from the org.Mm.eg.db v3.12.0 database[118], setting an FDR-corrected p-value threshold of 0.01 for statistical significance.

    2. eLife Assessment

      This important study of regulatory elements and gene expression in the craniofacial region of the fat-tailed dunnart shows that, compared to placental mammals, marsupial craniofacial tissue develops in a precocious manner, with enhancer regulatory elements as the primary driver of this difference. The compelling data, including a new dunnart genome assembly, provide an invaluable reference for future mammalian evolution studies, especially once additional developmental time points for the fat-tailed dunnart become available.

    3. Reviewer #1 (Public review):

      Summary:

      Compared to placental mammals, marsupials have a short gestation period and give birth to altricial young. To assist with the detection and response to cues that direct the neonate joeys to the mother's pouch, as well as latching onto the teat, marsupial craniofacial development at this stage is rapid and heterochronous relative to placentals. Cook et al. have presented an important study on the transcriptomic and epigenomic signature underlying this heterochronous development of craniofacial features across mammals, using the fat-tailed dunnart as a marsupial model.

      Given the lack of a dunnart genome, the authors prepared long and short read sequence datasets to assemble and annotate a novel genome to allow for mapping of RNAseq and ChIPseq data against H3K4me3 and H3K27ac, which allowed for identification of putative promoter and enhancer sites in dunnart. They found that genes proximal to these regulatory loci were enriched for functions related to bone, skin, muscle and embryonic development, verifying the precocious state of newborn dunnart facial tissue. When compared with mouse, the authors found a much higher proportion of promoter regions aligned between species than for enhancer regions, and subsequent profiling identified regulatory elements that are conserved across species and important for mammalian craniofacial development. In contrast, identification of dunnart-specific enhancers and patterns of RNA expression further confirms the precocious state of muscle development, as well as of sensory system development, in dunnart, suggesting that early formation of these features is critical for neonate marsupials.

      Marsupials are emerging as an important model for studying mammalian development and evolution, and the authors have performed a novel and thorough analysis that helps to elucidate the regulatory profile underlying craniofacial heterochrony. Impressively, this study also includes the assembly of a new marsupial reference genome that will benefit many future studies of mammalian developmental biology.

      Strengths:

      The genome assembly method was thorough, using two different long-read methods (Pacbio and ONT) to generate the long reads for contig and scaffold construction, increasing the quality of the final assembled genome, which was effectively annotated and used for functional analysis of orthologous regulatory elements.

      The birth of altricial young in marsupials is an important feature of their development that is distinct from placental mammals which are separated by about 160 million years of evolution. Very little is known, however, about the regulatory profile that contributes to the advanced craniofacial development required for joey survival. This is one of the few epigenomic studies performed in marsupials (of any organ) and the first performed in fat-tailed dunnart (also of any organ), which begins to address this lack of knowledge.

      The study also provides evidence supporting the important role enhancer elements play in mammalian phenotypic evolution, relative to promoters.

      Weaknesses:

      Biological replicates of facial tissue were collected at a single developmental time point of the fat-tailed dunnart within the first postnatal day (P0), and these were analysed in the context of similar mouse facial samples from the ENCODE consortium at six developmental time points, where previous work from the authors has shown that the younger mouse samples (E11.5-12.5) approximately correspond to the dunnart developmental stage (Cook et al. 2021). However, it would be useful to have samples from at least one older dunnart time point, for example, at a developmental stage equivalent to mouse E15.5. This would provide additional insight into the extent of accelerated face development in dunnart relative to mouse, i.e. how long do the regulatory elements that are activated early in dunnart remain active, and does their function later influence other aspects of craniofacial development?

    1. eLife Assessment

      This study presents a valuable comparison of the efficiency and precision of two prime editing methods to introduce single-nucleotide variants and longer exogenous DNA sequences into the zebrafish genome. Solid data support the conclusion that the PE2 prime editor Nickase is more effective at introducing single-nucleotide variants, while the PEn prime editor nuclease is more effective at integrating short sequences from 3 up to 30 base pairs, for both somatic and germline editing. The results will be of interest to the zebrafish community, in particular to model human disease variants in this model organism.

    2. Reviewer #1 (Public review):

      Ono et al. compared the activity of prime editor Nickase PE2 and prime editor nuclease PEn in introducing SNPs and short exogenous DNA sequences into the zebrafish genome to model human disease variants. They find the nickase PE2 prime editor had a higher rate of precise integration for introducing single-nucleotide substitutions, whereas the nuclease PEn prime editor showed improved precision of integration of short DNA sequences. In somatic tissue, the percentage of SNP variant precision edits improved when using PE2 RNP injection instead of mRNA injection, but increased precision editing correlated with elevated indel formation. While PEn overall had higher rates of precision edits, the indel rate was also elevated. Similar rates were observed when introducing a 3 bp stop codon into the ror gene using a standard pegRNA with a 13-nucleotide homology arm, or a springRNA lacking the homology arm that drives integration via NHEJ. Inclusion of an abasic sequence in the springRNA prevented imprecise edits caused by scaffold incorporation, but did not improve the overall percentage of precise edits in somatic tissue. Recovery of a germline ror-TGA integration allele using PEn with RNP was robust, resulting in 5 out of 10 founders transmitting a precise allele. Lastly, the authors demonstrate that PEn was effective at the integration of a 30 bp nuclear localization signal into the 5' end of GFP in an existing muscle-specific reporter line. However, the undefined number of cassettes in this multicopy transgene complicates accurate measurements of editing frequency. Integration of the NLS or other longer sequences at an endogenous locus would demonstrate the broad utility of this approach. 
      From the work presented, it is unclear how prime editing could be used to transiently model human pathogenic variants, given the low frequency of precision edits in somatic tissue, or to isolate stable germline alleles of variants that are potentially dominant negative or gain-of-function in nature. Without a direct comparison with CRISPR/Cas9 nuclease HDR-based methods that use oligonucleotide templates to introduce edits, the advantage of prime editing is unclear. A cost comparison between prime editing and HDR methods would also be of interest, particularly for integration of longer DNA sequences.

      The conclusions of the paper are mostly well supported, but some changes to the text and additional analyses would strengthen the conclusion that PE2 vs. PEn is preferred for introducing variants, short or long DNA sequences.

      (1) In Figure 3, the data indicate a significant increase in precise edits of the 3 bp TGA using PE2 RNP (11.5%) vs. PE2 mRNA (1.3%). At the adgrf3b locus, only PEn mRNA was tested for introducing the 3 bp and 12 bp insertions. The previous study testing PE2 for 3 and 12 bp insertions was mentioned, but the frequency was not listed, and the study wasn't cited (lines 204 - 207). A comparison of germline transmission rates using PE2 vs. PEn would support the conclusion that PEn allows precise integration of longer templates and recovery of germline integration alleles.

      (2) Figure 4 shows the results of introducing a TGA stop codon that is predicted to result in nonsense-mediated decay. Testing the ability to also isolate different substitution mutations in the germline would be useful information for identifying the most effective approach for generating human disease variant models.

      (3) A comparison with the prime editing variant knock-in frequencies reported in the recent publication by Vanhooydonck et al., 2025, Lab Animal should be included in the Discussion.

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript provides a comparison of nickase-based (PE2) and nuclease-based (PEn) Prime Editors in zebrafish, evaluating their efficiencies for substitutions, short insertions (3-30 bp), and germline transmission.

      Strengths:

      The manuscript has demonstrated for the first time that nuclease-based PEn more efficiently inserts nucleotide sequences up to 30 bp (nuclear localization sequence) than PE2, providing an improvement for the application of gene editing in functional genetics research. Additionally, the demonstration of stable zebrafish lines with edited ror2 and smyhc1:gfp loci is well-supported by sequencing and phenotypic data, confirming functional consequences of edits.

      Weaknesses:

      The study lacks conceptual innovation, as the central methodology (RNP-based Prime Editor delivery in zebrafish) was previously established by Petri et al. (2022). The present study extends this by testing longer insertions (30 bp) with nuclease-based PEn, but this incremental advance does not substantially shift the field's understanding or capabilities. The manuscript does not sufficiently differentiate its contributions from these precedents.

      The comparative analysis between PE2 and PEn systems suffers from limited evidentiary support. The comparison relies on single loci for substitutions (crbn) and insertions (ror2), raising concerns about generalizability. Additional validation across multiple loci is necessary to support broad conclusions about PE2/PEn performance.

    4. Reviewer #3 (Public review):

      The manuscript by Ono et al describes the application of prime editors to introduce precise genetic changes in the zebrafish model system. Probably the most important observation is that, compared to the "standard" PE2, the prime editor with full nuclease activity appears to be more efficient at introducing insertions into the genome. Although many laboratories around the world have successfully used oligonucleotide-mediated HDR to insert short exogenous sequences such as epitope tags or loxP sites into the zebrafish genome, the method suffers from a high frequency of indels at the edit site. Thus, additional tools are badly needed, making this manuscript very important. Length of the longer reported insertion (+30) is quite close to the range of V5 (14 amino acids) and ALFA (12 amino acids without "spacer" prolines) epitope tags, as well as loxP site (34 nucleotides). Conclusions drawn in the paper are supported by compelling evidence. I only have a few minor comments:

      (1) The logic for introducing two nucleotide changes (at +3 and +10) to change a single amino acid (I378) should be explicitly explained in the main body of the manuscript. It is indeed self-explanatory when looking at Supplementary Figure 1. One way of doing it could be to include Supplementary Figure 1a in Figure 1.

      (2) It is not clear why a 3-nucleotide insertion was used to generate W722X. The human W720X is a single-nucleotide polymorphism, and it should be possible to make a corresponding zebrafish mutant by introducing two nucleotide changes.

      (3) Lines 137-138: T7 Endonuclease assay used in Figure 2d detects all polymorphisms, both precise changes and indels. Thus, if this assay were performed on embryos shown in Figure 1c-d, the overall percentage of modified alleles would be similarly higher for PEn over PE2 (add up precise prime edits and indels). The conclusion in the last sentence of the paragraph is, therefore, incorrect, I believe.

      (4) Use of terminology. "Germline transmission" is typically used to refer to the fraction of F0s transmitting desired changes (or transgenes) to their progeny, while "germline mosaicism" refers to the fraction of F1s with the desired change in the progeny of a given F0. "Germline transmission" in line 217 should be replaced with "germline mosaicism".

      (5) Lines 253-255: The fraction of injected embryos that had mosaic nuclear expression of GFP, indicative of NLS insertion, should be clarified. It should also be clarified whether embryos positive for nuclear GFP were preselected for amplicon sequencing and germline transmission analyses. This is extremely important for extrapolation to scenarios like epitope tagging, where preselection is not possible.

      (6) Statistical analyses. It would be helpful to clarify why different statistical tests are sometimes used to assess seemingly very similar datasets (Figures 1c, 1d, 2b, 2c, 2f).

      (7) Discussion. Since authors suggest that PEn might be especially beneficial for insertion of additional sequences, it is important to stress locus-to-locus variability of success. While the precise +3 insertion was indeed tremendously efficient at both tested loci (ror2 and adgrf3b), +12 addition into adgrf3b was over 10 times less efficient (lines 193-194). In contrast, +30 into smyhc:GFP using the shorter pegRNA was highly efficient again with an average of 8.5% of sequence reads indicating precise integration (line 257, Figure 5c). Longer pegRNA did not work nearly as well (Figure 5c), but was still much better than +12 into adgrf3b. As dangerous as it is to extrapolate from small datasets, perhaps these observations indicate that optimization of RT template and PBS may be needed for each new locus in order to significantly outperform oligonucleotide-mediated HDR? If so, would the cost of ordering several pegRNAs and the effort needed to compare them factor in when deciding which method to use? Reported germline transmission rates for both ror2 W722X (+3, Figure 4a) and smyhc:NLS-GFP (+30, Figure 5f) are tantalizingly high.

    1. eLife Assessment

      This important study demonstrates that disruption of a common protein-folding system renders drug-resistant clinical bacteria susceptible to antibiotics. The work convincingly shows that targeting protein folding can be used to combat multidrug-resistant pathogens, both by potentiating the efficacy of existing drugs and by therapeutic use of small-molecule inhibitors. This study is significant and timely as it informs on a new strategy that is relevant to microbiologists and clinicians interested in combating antimicrobial resistance.

    2. Reviewer #1 (Public review):

      Summary:

      In this work the authors provide evidence that impairment of cell envelope protein homeostasis through blocking the machinery for disulfide bond formation restores efficacy of antibiotics including beta-lactam drugs and colistin against AMR in Gram-negative bacteria.

      Strengths:

      The authors employ a thorough approach to showcase the restoration of antibiotic sensitivity through inhibition of the DSB machinery, including the evaluation of various antibiotics on both normal and Dsb-deficient pathogenic bacteria (i.e. Pseudomonas and Stenotrophomonas). The authors corroborate these findings by employing Dsb inhibitors in addition to delta dsbA strains. The methodology is appropriate and includes measuring MICs as well as validating their observations in vivo using the Galleria model.

    3. Reviewer #2 (Public review):

      Summary:

      This work by Kadeřábková and Furniss et al. demonstrates the importance of a specific protein folding system for effectively folding β-lactamase proteins, which are responsible for resistance to β-lactam antibiotics, and shows that inhibition of this system sensitizes multidrug-resistant pathogens to β-lactam treatment. In addition, the authors extend these observations to a two-species co-culture model where β-lactamases provided by one pathogen can protect another, sensitive pathogen from β-lactam treatment. In this model, disrupting the protein folding system also disrupted protection of the sensitive pathogen from antibiotic killing. Overall, the data presented provide a convincing foundation for subsequent investigations and development of inhibitors for β-lactamases and other resistance determinants. This and similar strategies may have application to polymicrobial contexts when molecular interactions are suspected to confer resistance to natively antibiotic-sensitive pathogens.

      Strengths:

      The authors use clear and reliable molecular biology strategies to show that β-lactamase proteins from P. aeruginosa and Burkholderia species, expressed in E. coli in the absence of the dsbA protein folding system, are variably less capable of resisting the effects of different β-lactam antibiotics compared to the dsbA-competent parent strain (Figure 1). The appropriate control is included in the supplemental materials to demonstrate that this effect is specifically dependent on dsbA, since complementing the mutant with an intact dsbA gene restores antibiotic resistance (Figure S1). The authors subsequently show that this lack of activity can be explained by significantly reduced protein levels and loss-of-function protein misfolding in the dsbA mutant background (Figure 2). These data support the importance of this protein folding mechanism in the activity of multiple clinically relevant β-lactamases.

      Native bacterial species are used for subsequent experiments, and the authors provide important context for their antibiotic choices and concentrations by referencing the breakpoints that guide clinical practice. In Figure 4, the authors show that loss of the DsbA system in P. aeruginosa significantly sensitizes clinical isolates expressing different classes of β-lactamases to clinically relevant antibiotics. The appropriate control showing that the dsbA1 mutation does not result in sensitivity to a non-β-lactam antibiotic is included in Figure S2. The authors further show, using an in vivo model for antibiotic treatment, that treatment of a dsbA1 mutant results in moderate and near-complete survival of the infected organisms. The importance of this system in S. maltophilia is then investigated similarly (Figure 5), showing that a dsbA dsbL mutant is also sensitive to β-lactams and colistin, another antibiotic whose resistance mechanism is dependent on the DsbA protein folding system. Importantly, the authors show that a small-molecule inhibitor that disrupts the DsbA system, rather than genetic mutations, is also capable of sensitizing S. maltophilia to these antibiotics. It should be noted that while the sensitization is less pronounced, this molecule has not been optimized for S. maltophilia and would be expected to increase in efficacy following optimization. Together, the data support that interference with the DsbA system in native hosts can sensitize otherwise resistant pathogens to clinically relevant antibiotic therapy.

      Finally, the authors investigate the effects of co-culturing S. maltophilia and P. aeruginosa (Figure 5E). These assays are performed in synthetic cystic fibrosis sputum medium (SCFM), which provides a nutritional context similar to that in CF but without the presence of more complex components such as mucin. The authors show that while P. aeruginosa alone is sensitive to the antibiotic, it can survive moderate concentrations in the presence of S. maltophilia and even grow in higher concentrations where S. maltophilia appears to overproduce its β-lactamases. However, this protection is lost in S. maltophilia without the DsbA protein folding system, showing that the protective effect depends on functional production of β-lactamase in the presence of viable S. maltophilia. The authors further achieved the difficult task of labeling these multi-drug resistant pathogens with selection markers to determine co-infection CFUs in the supplemental materials. Overall, the data support a protective role for DsbA-dependent β-lactamase under these co-culture conditions.

      Weaknesses:

      No significant weaknesses are noted beyond the limitations identified and discussed by the authors.

    4. Reviewer #3 (Public review):

      Summary:

      In the face of emerging antibiotic resistance and the slow pace of drug discovery, strategies that can enhance the efficacy of existing clinically used antibiotics are highly sought after. In this manuscript, through genetic manipulation of a model bacterium (Escherichia coli) and clinically isolated, antibiotic-resistant strains of concern (Pseudomonas, Burkholderia, Stenotrophomonas), an additional drug target to combat resistance and potentiate existing drugs is put forward. These observations were validated in pure cultures, mixed bacterial cultures and worm models. The drug target investigated in this study appears to be broadly relevant to the challenge posed by lactamase enzymes that render lactam antibiotics ineffective in the clinic. Compounds that target this enzyme are already being developed, some of which were tested in this study, displaying promising results and potential for further optimization by medicinal chemists.

      Strengths:

      The work is well designed and well executed and targets an urgent area of research with the unprecedented increase in antibiotic resistance.

      Weaknesses:

      The impact of the work can be strengthened by demonstrating increased efficacy of antibiotics in mouse models or wound models for Pseudomonas infections. Worm models are relevant, but still distant from investigations in animal models.

    5. Author response:

      The following is the authors’ response to the current reviews.

      Reviewer #1 (Recommendation For the Authors):

      Thanks to the authors for addressing my suggestions. I think these modifications have improved the clarity of the data and the overall presentation of the manuscript. The methods are now more clearly explained, and the additional details help make the results easier to interpret. Where addressing the comment wasn't feasible, the authors gave reasonable explanations. Overall, the revisions strengthen the paper, and I have no further concerns.

      Thank you for your recommendations, which have significantly improved our paper.

      Reviewer #2 (Recommendation For the Authors):

      The additional work conducted by the authors is greatly appreciated. All concerns (and beyond) have been thoroughly addressed by the authors and I am thankful for their consideration and attention to detail. Only one possible issue with the revisions is described below for consideration:

      Regarding the CFU counts and/or axis labels in Figure S3B, some of the listed "CFU per 1 mL" values (in both the figure itself and File S2B) are extraordinarily high. For example, the greatest CFU for PA14 observed in Figure 4E is ~1x10^9. However, PA14 at 0 ug/mL ceftazidime reaches nearly 1x10^16 in Figure S3B. From what I can tell, this should be beyond the capacity of bacteria in this space by several orders of magnitude. (E.g., a cubic centimeter [~1 mL] is ~1x10^12 cubic micrometers. At their smallest dimensions and volume, a maximum of ~1x10^13 cells could theoretically fit in this space assuming no liquid and perfect organization.) Similarly, both "AMM" and "AMM (+PA14)" consistently reach CFUs between 1x10^12 and 1x10^14 in this assay. Are the authors confident in the values and/or depiction of CFUs for this figure? It seems like this could be a labeling or dilution/counting issue.
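      The packing argument in this comment can be checked with a few lines of arithmetic. The sketch below is only an illustration: the cell volume of 0.1 cubic micrometers is an assumption, chosen because it is the value implied by the ~1x10^13 ceiling quoted above, and the function name is hypothetical.

```python
import math

# 1 cm^3 = (1e4 um)^3 = 1e12 um^3, i.e. roughly 1 mL.
UM3_PER_ML = 1e12


def max_cells_per_ml(cell_volume_um3: float) -> float:
    """Upper bound on cells per mL, assuming no liquid and perfect packing."""
    return UM3_PER_ML / cell_volume_um3


# Assumption: 0.1 um^3 as the smallest plausible cell volume. This is the
# value implied by the ~1e13 ceiling in the comment, not a measurement.
bound = max_cells_per_ml(0.1)

# Approximate value read from the comment for PA14 at 0 ug/mL in Fig. S3B.
reported = 1e16

# How many orders of magnitude the reported count exceeds the packing bound.
excess_orders = math.log10(reported / bound)
print(f"bound ~ {bound:.1e} cells/mL, excess ~ {excess_orders:.1f} orders")
```

      Under these assumptions the reported value sits well above the packing ceiling, which is the point the comment makes.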

      Thank you for your positive remarks on our revised manuscript and for your constructive comments that have strengthened our work.

      We agree with the concern regarding the CFU counts in Figure S3B. The very high values (>10<sup>12</sup> CFU) reflect a technical enumeration artifact that, due to the nature of the assay, cannot be fully avoided. The origin of these inflated counts is described in more detail below:

      Following competition assays between Pseudomonas aeruginosa and Stenotrophomonas maltophilia in liquid culture with antibiotics, we enumerate survivors for each species by colony forming unit (CFU) counts. Because two different bacterial species must be quantified from mixed cultures, we use a gentamicin resistance marker carried by one species at a time.

      Each condition is therefore enumerated twice, as we alternate which species harbors the gentamicin cassette.

      During coculture in antibiotics and minimal medium, clinical isolates of P. aeruginosa and S. maltophilia, like those used here, can transiently increase their tolerance to antibiotics, including aminoglycosides. This reduces the effectiveness of gentamicin selection at the plating step necessary for CFU enumeration. For the data presented in Figure S3B, in a subset of high-OD<sub>600</sub> conditions in the competition assay, this tolerance produces artificially inflated CFU values at the enumeration step, which exceed the biological carrying capacity.

      We evaluated alternative enumeration strategies (e.g., fluorescent protein markers with a nonselective medium), but these proved unsuitable for these strains due to differences in growth rates and media compatibility, introducing other large biases. Given these constraints, selective plating remains the only feasible approach for this work, and the associated artifact cannot be eliminated entirely.

      Importantly, transient resistance (tolerance), although common, is not a universal occurrence (e.g., we did not observe it when we performed the experiments shown in Figure 4E). When it does arise, it occurs reproducibly under the same experimental high-OD<sub>600</sub> conditions and does not obscure any of the relative comparisons that underpin our conclusions.

      For transparency, we have retained the measured values in Figure S3B and we note in the legend that counts above ~10<sup>12</sup> CFU represent a technical overestimation due to transient gentamicin tolerance. Counts below 10<sup>12</sup> CFU are accurately enumerated.
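      The legend note described above amounts to a simple threshold rule. A minimal sketch of how such counts could be flagged is given below; the function name and the example labels and values are hypothetical and are not taken from File S2.

```python
# Threshold stated in the response: counts above ~1e12 CFU are treated as
# technical overestimates caused by transient gentamicin tolerance.
OVERESTIMATE_THRESHOLD_CFU = 1e12


def flag_overestimates(counts):
    """Map each sample label to True if its CFU count exceeds the threshold."""
    return {label: cfu > OVERESTIMATE_THRESHOLD_CFU for label, cfu in counts.items()}


# Hypothetical counts, for illustration only.
example_counts = {"sample_a": 1e16, "sample_b": 5e13, "sample_c": 1e9}
flags = flag_overestimates(example_counts)

for label, flagged in flags.items():
    note = "overestimate (tolerance artifact)" if flagged else "enumerated normally"
    print(f"{label}: {example_counts[label]:.0e} CFU, {note}")
```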

      Reviewer #3 (Recommendation For the Authors):

      All concerns have been satisfied and the manuscript is ready for publishing.

      Thank you for your recommendations, which have significantly improved our paper.


      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      The study would benefit from presenting raw data in some cases, such as MIC values and SDS-PAGE gels, by clarifying the number of independent experiments used, as well as further clarification on statistical significance for some of the data.

      All original data used to generate Fig. 1, Fig. 4E, Fig. S3 and Fig. S4A are presented in File S2. Tab (A) is dedicated to data used for Fig. 1 and Fig. S4A, while tabs (B) and (C) show the data used for Fig. 4E and S3, respectively. This information is indicated in the legends of the relevant figures.

      All experiments in this study were performed in three independent (biological) experiments (with the exception of the complementation data shown in Fig. S1 and Fig. S5, which were performed in two independent (biological) experiments). The number of biological and technical replicates for each experiment is stated in the figure legends, as well as in the “Statistical analysis of experimental data” part of the “Materials and Methods” section of the paper. Specifically, for antibiotic MIC assays we have not performed statistical analyses as per recommended practice. The reason for this is stated in the following section from the “Statistical analysis of experimental data” part of the “Materials and Methods” section of the paper (lines 699-711 of the revised manuscript):

      “Antibiotic MIC values were determined in biological triplicate, except for MIC values recorded for dsbA complementation experiments in our E. coli K-12 inducible system that were carried out in duplicate. All ETEST MICs were determined as a single technical replicate, and all BMD MICs were determined in technical triplicate. All recorded MIC values are displayed in the relevant graphs; for MIC assays where three or more biological experiments were performed, the bars indicate the median value, while for assays where two biological experiments were performed the bars indicate the most conservative of the two values (i.e., for increasing trends, the value representing the smallest increase and for decreasing trends, the value representing the smallest decrease). We note that in line with recommended practice, our MIC results were not averaged. This should be avoided because of the quantized nature of MIC assays, which only inform on bacterial survival for specific antibiotic concentrations and do not provide information for antibiotic concentrations that lie in-between the tested values.”
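      The summarising rule in the quoted passage can be sketched in code. This is an illustration under stated assumptions, not the authors' analysis script: the function name and the `reference` argument (the MIC of the comparison strain) are hypothetical, and `median_low` is used so that an even number of replicates never yields an averaged value, which is an assumption about how such a case would be handled.

```python
from statistics import median_low


def summarise_mic(values, reference):
    """Pick the bar value for a set of biological-replicate MICs.

    Three or more replicates: the median, taken as an observed value.
    Two replicates: the most conservative value, i.e. the one closest to
    the reference MIC (smallest increase or smallest decrease).
    """
    if len(values) >= 3:
        # median_low always returns one of the observed values.
        return median_low(values)
    if len(values) == 2:
        return min(values, key=lambda v: abs(v - reference))
    raise ValueError("need at least two biological replicates")


# Hypothetical MICs (ug/mL) against a reference MIC of 64 ug/mL.
triplicate = summarise_mic([8, 16, 8], reference=64)
duplicate_decrease = summarise_mic([4, 16], reference=64)
duplicate_increase = summarise_mic([128, 256], reference=64)
print(triplicate, duplicate_decrease, duplicate_increase)
```

      Returning an observed value rather than a mean reflects the quantized nature of MIC assays described in the passage.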

      Reviewer #2 (Public review):

      While Figure 5E demonstrates a protective effect of DsbA-dependent β-lactamase, the omission of CFU data for S. maltophilia makes it difficult to assess the applicability of the polymicrobial strategy. Since S. maltophilia is pre-cultured prior to the addition of P. aeruginosa and antibiotics, it is unclear whether the protective effect is dependent on high S. maltophilia CFU. It is also unclear what the fate of the S. maltophilia dsbA dsbL mutant is under these conditions. If DsbA-deficient S. maltophilia CFU is not impacted, then this treatment will result in the eradication of only one of the pathogens of interest. If the mutant is lost during treatment, then it is not clear whether the loss of protection is due specifically to the production of non-functional β-lactamase or simply the absence of S. maltophilia.

      We have simultaneously tracked the abundance of P. aeruginosa and S. maltophilia strains in our cross-protection experiment for select antibiotic concentrations. To be able to perform this experiment, we had to label two extremely-drug-resistant strains of S. maltophilia with an antibiotic resistance marker that allowed us to quantify them in mixtures with P. aeruginosa. Our results can be found in Fig. S3 of our revised manuscript and, in a nutshell, show that ceftazidime treatment leads to eradication of both P. aeruginosa and S. maltophilia when disulfide bond formation is impaired in S. maltophilia.

      The following text was added to address the questions of the reviewer:

      “Due to the naturally different growth rates of these two species (S. maltophilia grows much slower than P. aeruginosa) especially in laboratory conditions, the protocol we followed [1] requires S. maltophilia to be grown for 6 hours prior to co-culturing it with P. aeruginosa. To ensure that at this point in the experiment our two S. maltophilia strains, with and without dsbA, had grown comparatively to each other, we determined their cell densities (Fig. S3A). We found that S. maltophilia AMM dsbA dsbL had grown at a similar level as the wild-type strain, and both were at a higher cell density [~10<sup>7</sup> colony forming units (CFUs)] compared to the P. aeruginosa PA14 inoculum (5 x 10<sup>4</sup> CFUs)” (lines 353-361 of the revised manuscript).

      “To ensure that ceftazidime treatment leads to eradication of both P. aeruginosa and S. maltophilia when disulfide bond formation is impaired in S. maltophilia, we monitored the abundance of both strains in each synthetic community for select antibiotic concentrations (Fig. S3B). In this experiment we largely observed the same trends as in Fig. 4E. At low antibiotic concentrations, for example 4 μg/mL of ceftazidime, S. maltophilia AMM is fully resistant and thrives, thus outcompeting P. aeruginosa PA14 (dark pink and dark blue bars in Fig. S3B). The same can also be seen in Fig. 4E, whereby decreased P. aeruginosa PA14 CFUs are recorded. By contrast S. maltophilia AMM dsbA dsbL already displays decreased growth at 4 μg/mL of ceftazidime because of its non-functional L1-1 enzyme, allowing comparatively higher growth of P. aeruginosa (light pink and light blue bars in Fig. S3B). Despite the competition between the two strains, P. aeruginosa PA14 benefits from S. maltophilia AMM’s high hydrolytic activity against ceftazidime, which allows it to survive and grow in high antibiotic concentrations even though it is not resistant (see 128 μg/mL; dark pink and dark blue bars in Fig. S3B). In stark opposition, without its disulfide bond in S. maltophilia AMM dsbA dsbL, L1-1 cannot confer resistance to ceftazidime, resulting in killing of S. maltophilia AMM dsbA dsbL and, consequently, also of P. aeruginosa PA14 (see 128 μg/mL; light pink and light blue bars in Fig. S3B).

      The data presented here show that, at least under laboratory conditions, targeting protein homeostasis pathways in specific recalcitrant pathogens has the potential to not only alter their own antibiotic resistance profiles (Fig. 3 and 4A-D), but also to influence the antibiotic susceptibility profiles of other bacteria that co-occur in the same conditions (Fig. 5). Admittedly, the conditions in a living host are too complex to draw direct conclusions from this experiment. That said, our results show promise for infections, where pathogen interactions affect treatment outcomes, and whereby their inhibition might facilitate treatment” (lines 381-406 of the revised manuscript).

      The alleged clinical relevance and immediate, theoretical application of this approach should be properly contextualized. At multiple junctures, the authors state or suggest that interactions between S. maltophilia and P. aeruginosa are known to occur in disease or have known clinical relevance related to treatment failure and disease states. For instance, the citations provided for S. maltophilia protection of P. aeruginosa in the CF lung environment both describe simplified laboratory experiments rather than clinical or in vivo observations. Similarly, the citations provided for both the role of S. maltophilia in treatment failure and CF disease severity do not support either claim. The role of S. maltophilia in CF is currently unsettled, with more recent work reporting conflicting results that support S. maltophilia as a marker, rather than cause, of severe disease. These citations also do not support the suggestion that S. maltophilia specifically contributes to treatment failure. While it is reasonable to pursue these ideas as a hypothesis or potential concern, there is no evidence provided that these specific interactions occur in vivo or that they have clinical relevance.

      Thank you for your comment. You are entirely correct. We have amended the text throughout our revised manuscript to avoid overstating the role of S. maltophilia in CF infections and to reference additional relevant works in the literature. Please find below representative examples of such passages:

      “On the other hand, CF microbiomes are increasingly found to encompass S. maltophilia [2-4], a globally distributed opportunistic pathogen that causes serious nosocomial respiratory and bloodstream infections [5-7]. S. maltophilia is one of the most prevalent emerging pathogens [6] and it is intrinsically resistant to almost all antibiotics, including β-lactams like penicillins, cephalosporins and carbapenems, as well as macrolides, fluoroquinolones, aminoglycosides, chloramphenicol, tetracyclines and colistin. As a result, the standard treatment option for lung infections, i.e., broad-spectrum β-lactam antibiotic therapy, is rarely successful in countering S. maltophilia [7,8], creating a definitive need for approaches that will be effective in eliminating both pathogens” (lines 33-41 of the revised manuscript).

      “Of the organisms studied in this work, S. maltophilia deserves further discussion because of its unique intrinsic resistance profile. The prognosis of CF patients with S. maltophilia lung carriage is still debated [4,9-16], largely because studies with extensive and well-controlled patient cohorts are lacking. This notwithstanding, the therapeutic options against this pathogen are currently limited to one non-β-lactam antibiotic-adjuvant combination, trimethoprim-sulfamethoxazole [17-20], which is not always effective, and a few last-line β-lactam drugs, like the fifth-generation cephalosporin cefiderocol and the combination aztreonam-avibactam. Resistance to commonly used antibiotics causes many problems during treatment and, as a result, infections that harbor S. maltophilia have high case fatality rates [7]. This is not limited to CF patients, as S. maltophilia is a major cause of death in children with bacteremia [5]” (lines 440-450 of the revised manuscript).

      Reviewer #3 (Public review):

      The impact of the work can be strengthened by demonstrating increased efficacy of antibiotics in mice models or wound models for Pseudomonas infections. Worm models are relevant, but still distant from investigations in animal models.

      Thank you for this comment. We appreciate the sentiment, and we would have liked to be able to perform experiments in a murine model of infection. Several reasons made this impossible, and as a result we used G. mellonella as an informative preliminary in vivo infection model. The DSB proteins have been shown to play a central role in bacterial virulence. Because of this, our P. aeruginosa and S. maltophilia mutant strains are not efficient in establishing an infection, even in a wound model. This could be overcome had we been able to use the chemical inhibitor of the DSB system in vivo; however, this is also not possible. The chemical compound that we use to inhibit the function of DsbA acts on DsbB. Inhibition of DsbB blocks the re-oxidation of DsbA and leads to its accumulation in its inactive reduced form. However, the action of the inhibitor can be bypassed through re-oxidation and re-activation of DsbA by small-molecule oxidants such as L-cystine, which are abundant in rich growth media or animal tissues. This makes the inhibitor only suitable for in vitro assays that can be performed in minimal media, where the presence of small-molecule oxidants can be strictly avoided, but entirely unsuitable for an insect or a vertebrate animal model.

      Reviewer #1 (Recommendation For the Authors):

      (1) The analysis of the role of DsbA in the assembly of cysteine-containing β-lactamases is a significant finding. However, in addition to showing the MIC fold difference, I think, it would be important to show the raw data for the actual MIC values obtained for each β-lactamase enzyme/antibiotic combination and in both strains (+ and - dsbA).

      Also, can the authors clarify whether these experiments were conducted on 3 independent samples (there seems to be some contradicting information in the paper and the supplementary figures). If possible, I would also recommend showing in the figure whether the MIC differences observed were statistically significant.

      All original data used to generate Fig. 1, Fig. 4E, Fig. S3 and Fig. S4A are presented in File S2. Tab (A) is dedicated to data used for Fig. 1 and Fig. S4A, while tabs (B) and (C) show the data used for Fig. 4E and S3, respectively. This information is indicated in the legends of the relevant figures.

      All experiments in this study were performed in three independent (biological) experiments (with the exception of the complementation data shown in Fig. S1 and Fig. S5, which were performed in two independent (biological) experiments). The number of biological and technical replicates for each experiment is stated in the figure legends, as well as in the “Statistical analysis of experimental data” part of the “Materials and Methods” section of the paper. Specifically, for antibiotic MIC assays we have not performed statistical analyses as per recommended practice. The reason for this is stated in the following section from the “Statistical analysis of experimental data” part of the “Materials and Methods” section of the paper (lines 699-711 of the revised manuscript):

      “Antibiotic MIC values were determined in biological triplicate, except for MIC values recorded for dsbA complementation experiments in our E. coli K-12 inducible system that were carried out in duplicate. All ETEST MICs were determined as a single technical replicate, and all BMD MICs were determined in technical triplicate. All recorded MIC values are displayed in the relevant graphs; for MIC assays where three or more biological experiments were performed, the bars indicate the median value, while for assays where two biological experiments were performed the bars indicate the most conservative of the two values (i.e., for increasing trends, the value representing the smallest increase and for decreasing trends, the value representing the smallest decrease). We note that in line with recommended practice, our MIC results were not averaged. This should be avoided because of the quantized nature of MIC assays, which only inform on bacterial survival for specific antibiotic concentrations and do not provide information for antibiotic concentrations that lie in-between the tested values.”

      (2) For Figure 2A, can the authors provide the full Westerns and ideally the SDS-PAGE gel corresponding to the Westerns where the β-lactamases and the control DnaK were detected?

      Thank you for this comment. Full immunoblots and SDS PAGE analysis of the immunoblot samples for total protein content are shown in File S3 of our revised manuscript.

      (3) For the enzymatic assays, was the concentration of enzyme used "normalised" based on the amount detected in the Westerns where possible, or was only the total amount of protein considered? When similar amounts of enzyme were added, was the activity still compromised?

      The β-lactam hydrolysis assay was normalized based on the weight of the cell pellets (wet cell pellet mass) of the tested strains. This means, that for each enzyme expressed in cells with and without DsbA, strains were normalized to the same weight to volume ratio, and thus strains expressing the same enzyme were only compared to each other.

      Because enzyme degradation in the absence of DsbA is a key factor underlying the effects we describe for most of the tested β-lactamases (see Fig. 2A and S4A; no protein band is detected for 5 of the 7 enzymes in the dsbA mutant), it was not possible to normalize our samples based on enzyme levels detected by immunoblot. Normalization based on enzyme amounts would be feasible had we purified each β-lactamase after expression in the two different strain backgrounds (+/- dsbA) assuming sufficient protein amounts could be isolated from the dsbA mutant strain. Nonetheless, we feel that such a comparison would be misleading, since enzyme degradation likely plays the biggest role in the lack of activity observed for most of these enzymes in the absence of DsbA.

      (4) Not sure whether Fig 3 is very informative. Perhaps it could be redesigned to better encapsulate the findings in this manuscript (combine Figures 3 and 6 into one). I would also include the chemical structure of the inhibitors used and perhaps include how they block the system by binding to DsbB.

      Thank you for this comment. Fig. 3 was combined with Fig. 6 of the submitted manuscript. The new model figure is Fig. 5 in our revised manuscript.

      The inhibitor compound used in our study has been extensively characterized in a previous publication [21]. Considering that this inhibitor is not the main focus of our paper, we have avoided showing its chemical structure in any of the main display items. That said, its structure can be found in File S5 of our revised manuscript, which contains the quality control information on this compound. As suggested, we included the following sentence to describe the mode of action of this inhibitor: “Compound 36 was previously shown to inhibit disulfide bond formation in P. aeruginosa via covalently binding onto one of the four essential cysteine residues of DsbB in the DsbA-DsbB complex [21]” (lines 309-311 of the revised manuscript).

      (5) Figure 4: Similar to my comment above, showing in the figure whether the differences observed in Figure 4, particularly A-C, are statistically significant (i.e., Galleria survival difference in the presence and absence of dsbA) would be beneficial.

      As mentioned in our answer to comment 1 above, we have not performed statistical analyses for antibiotic MIC assays because, in line with recommended practice, our MIC results were not averaged (Fig. 3A,B,D,E of our revised manuscript). This should be avoided because of the quantized nature of MIC assays, which only inform on bacterial survival for specific antibiotic concentrations and do not provide information for antibiotic concentrations that lie in-between the tested values. Statistical analysis of G. mellonella survival data (Fig. 3C,F of our revised manuscript) was performed and is described fully in the legend of Fig. 3, as well as in the “Statistical analysis of experimental data” part of the “Materials and Methods” section of the paper (lines 729-738 of the revised manuscript). Finally, the statistical analyses for the most important comparisons in panels (C) and (F) of Fig. 3 are also marked directly on the figure.

      (6) Were the authors able to test the redox state of DsbA upon addition of the DsbB inhibitor to further demonstrate that the effects observed were indeed due to the obstruction of the Dsb machinery and not due to off-target effects?

      Thank you for the opportunity to clarify this. In previous work from our lab, we have used a DSB system inhibitor termed “compound 12” in [22] with activity against DsbB proteins from Enterobacteria. In our previous study [23] we did, indeed, test the redox state of DsbA in the presence of this inhibitor compound. We could not perform the same experiment here with “compound 36” from [21], because we do not have an antibody against the DsbA protein of S. maltophilia. That said, we have carried out experiments that confirm that our results are due to specific inhibition of the DSB system and not because of off-target effects. In particular, we show that the gentamicin MIC values of S. maltophilia AMM remain unchanged in the presence of the inhibitor and that treatment of S. maltophilia AMM dsbA dsbL with the compound does not affect its colistin MIC value (Fig. S2E and lines 317-320 of the revised manuscript).

      (7) Given the remarkable effects shown by the DsbB inhibitor, did the authors use this compound to assess whether inhibition of the Dsb system with small molecules would block cross-resistance in S. maltophilia - P. aeruginosa mixed communities (Fig 5D)?

      Unfortunately, this was not possible. The decrease in the ceftazidime MIC value of S. maltophilia AMM in the presence of the DSB inhibitor compound is more modest than the effects we observed when the dsbA dsbL mutant is used (compare Fig. 4D (left) with Fig. 4A of the revised manuscript). This means that in the presence of the DSB inhibitor there are still sufficient amounts of functional β-lactamase present and we expect that they would contribute to cross-protection of P. aeruginosa. While the use of the DSB inhibitor does have a drastic impact on the colistin resistance profile of S. maltophilia AMM (Fig. 4D of the revised manuscript), unlike β-lactamases, which act as common goods, MCR enzymes act solely on the lipopolysaccharide of their producer and do not contribute to bacterial interactions, precluding the use of colistin for a cross-protection experiment.

      Reviewer #2 (Recommendation For the Authors):

      (1) The acronym used for synthetic cystic fibrosis sputum medium (lines 523, 531, 535, 601, and 603) is defined in the manuscript as 'SCF', but the common formulation is 'SCFM', including in the provided citation. Suggest changing to SCFM for consistency.

      Thank you for this comment. This has been amended throughout our revised manuscript.

      (2) In Figure 1, while the legend states that "No changes in MIC values are observed for strains harboring the empty vector control (pDM1)[...]" (lines 729-30), the median of ceftazidime in the pDM1 control appears to indicate a 2-fold decrease in MIC. This would not seem to significantly impact the other results since the MIC decreases observed for other conditions are all 3-fold or greater, but this should be addressed and/or explained in the text.

      You are correct. Thank you for the opportunity to clarify this. Since MIC assays have a degree of variability, we have only followed decreases in MIC values that are greater than 2-fold. For most of our controls, the recorded MIC fold changes are below 2-fold. The only exception is the ceftazidime MIC drop of the empty-vector control, which shows a 2-fold change that we do not consider significant.

      To ensure that this is clear in our text and figure legends the following changes were made:

      The clause “only differences larger than 2-fold were considered” was added to the text (lines 110-111 of the revised manuscript).

      We amended the legend of Fig. 1 accordingly: “No changes in MIC values are observed for the aminoglycoside antibiotic gentamicin (white bars) confirming that absence of DsbA does not compromise the general ability of this strain to resist antibiotic stress. Minor changes in MIC values (≤ 2-fold) are observed for strains harboring the empty vector control (pDM1) or those expressing the class A β-lactamases L2-1 and LUT-1, which contain two or more cysteines (Table S1), but no disulfide bonds (top row)”.
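      The 2-fold rule described in this response can be written as a short check. The sketch below is illustrative only: the function names and the example MIC values are hypothetical, and the threshold is the strict "larger than 2-fold" criterion quoted above.

```python
def mic_fold_decrease(mic_with_dsba: float, mic_without_dsba: float) -> float:
    """Fold decrease in MIC when DsbA is absent (values > 1 mean a drop)."""
    return mic_with_dsba / mic_without_dsba


def is_considered(fold_change: float, threshold: float = 2.0) -> bool:
    """Only differences strictly larger than the threshold are followed."""
    return fold_change > threshold


# Hypothetical MIC pairs (ug/mL): a control with a 2-fold drop and an
# enzyme-expressing strain with a 4-fold drop.
control_fold = mic_fold_decrease(32, 16)
enzyme_fold = mic_fold_decrease(32, 8)

print(control_fold, is_considered(control_fold))
print(enzyme_fold, is_considered(enzyme_fold))
```

      With a strict inequality, an exact 2-fold change such as the empty-vector ceftazidime result falls on the "minor change" side of the rule.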

      (3) Similarly, in Fig S1E, there appears to be only partial complementation for BPS-1m. Do the authors hypothesize that this observation is related to a folding defect, rather than degradation of protein, as described for BPS-1m for Figure 2?

      Thank you for the opportunity to clarify this. You are correct that we only achieve partial complementation for the E. coli strain expressing the BPS-1m enzyme from the Burkholderia complex. Despite the fact that the gene for this enzyme was codon optimized, we observed that its expression in E. coli is sub-optimal and incurs fitness effects. In fact, to record the data presented in our manuscript the E. coli strains had to be transformed anew every time. Considering that the related enzyme BPS-6 does not present any of these challenges, we attribute the partial complementation to technical difficulties with the expression of the bps-1m gene in E. coli. 

      We clarified this by adding the following clause to our manuscript: “we only achieve partial complementation for the dsbA mutant expressing BPS-1m, which we attribute to the fact that expression of this enzyme in E. coli is sub-optimal” (lines 132-134 of the revised manuscript).

      (4) Lines 204-206: "[...]we deleted the principal dsbA gene, dsbA1 (pathogenic bacteria often encode multiple DsbA analogues [24,25]), in several multidrug-resistant (MDR) P. aeruginosa clinical strains (Table S2)". That multiple DsbA analogues are often encoded is good information to provide, but it was unclear from quickly looking at the citations whether Pa is counted among these. Is it expected that all oxidative protein folding in Pa functions through DsbA1? Conveying this information, if possible, may make the impact of the results in this model clearer.

      Thank you for this comment. To address it we added the following text to our manuscript:

      “To determine whether the effects on β-lactam MICs observed in our inducible system (Fig. 1 and [23]) can be reproduced in the presence of other resistance determinants in a natural context with endogenous enzyme expression levels, we deleted the principal dsbA gene, dsbA1, in several multidrug-resistant (MDR) P. aeruginosa clinical strains (Table S2). Pathogenic bacteria often encode multiple DsbA analogues [24,25] and P. aeruginosa is no exception. It encodes two DsbAs, but DsbA1 has been found to catalyze the vast majority of the oxidative protein folding reactions taking place in its cell envelope [26]” (lines 172-178 of the revised manuscript).

      (5) Regarding the clinical Pa isolates G4R7 and G6R7, have the authors performed any phenotypic testing on these strains to identify differences that might explain the substantial difference in piperacillin MIC? I.e., can these isolates be distinguished by growth rate, genetic markers or expression levels, early or late infection, mucoidy, etc.? This is not essential for the current work, but could weigh on the efficacy of this treatment strategy for AIM1-expressing clinical isolates. (E.g., the G4R7 dsbA1 strain exhibits a piperacillin MIC still ~2-fold higher than WT G6R7).

      Thank you for the opportunity to clarify this. For clinical strains used in our study, we have evaluated their antibiotic resistance profiles, but we have not performed any additional phenotypic characterization. There are many reasons that contribute to differences in antibiotic resistance, starting simply from β-lactamase expression levels and extending to organismal effects, like the ones mentioned by the reviewer. Such characterization would fall outside the scope of our paper, especially since we sensitize our tested P. aeruginosa clinical isolates for the majority of the β-lactam antibiotics tested.

      We acknowledged this by adding the following sentence to our revised manuscript: 

      “Despite the fact that P. aeruginosa G4R7 dsbA1 was not sensitized for piperacillin-tazobactam, possibly due to the high level of piperacillin-tazobactam resistance of the parent clinical strain, our results across these two isolates show promise for DsbA as a target against β-lactam resistance in P. aeruginosa” (lines 191-194 of the revised manuscript).

      (6) Lines 180-2: "This shows that without their disulfide bonds, these proteins are unstable and are ultimately degraded by other cell envelope proteostasis components [33]". While it is clear that protein is significantly lost in all cases except for BPS-1m in 2A, the dsbA pDM1bla constructs in 2B appear to all retain non-trivial (>10-fold) nitrocefin hydrolysis activity compared to the dsbA pDM1 control. This does not impact the other results in 2B, but it would seem that a loss-of-function folding defect, as described subsequently for BPS-1m, is also part of the explanation for the observed MIC decreases, and this was not necessarily clear from the quoted passage. This could simply be clarified in the final sentence - that both mechanisms are potentially in play - if the authors agree with that interpretation.

      You are correct, thank you for your comment. We amended the text in our revised manuscript as follows: 

      “The data presented so far (Fig. 1 and 2) demonstrate that disulfide bond formation is essential for the biogenesis (stability and/or protein folding) and, in turn, activity of an expanded set of clinically important β-lactamases, including enzymes that currently lack inhibitor options” (lines 158-161 of the revised manuscript).

      (7) While it is clear from Figure S2 that the various dsb mutants do not have a general growth defect or collateral sensitivity to another antibiotic, it does not appear that there is an analogous control for the DSB inhibitor demonstrating no growth/toxic effects at the concentration used. This could be provided similarly to Figure S2, using gentamicin as a control antibiotic.

      We have carried out experiments that confirm that our results are due to specific inhibition of the DSB system and not because of off-target effects. In particular, we show that the gentamicin MIC values of S. maltophilia AMM remain unchanged in the presence of the inhibitor and that treatment of S. maltophilia AMM dsbA dsbL with the compound does not affect its colistin MIC value (Fig. S2E and lines 317-320 of the revised manuscript).

      (8) Complementation is appropriately provided for experiments with E. coli, but is not provided for P. aeruginosa or S. maltophilia. It should be straightforward to complement in Pa, but is also probably less critical considering the evidence from E. coli. However, since the Sm mutant is a gene cluster with two genes, it would seem more imperative to complement this strain. This reviewer is not familiar enough with Sm to know if complementation is routine or feasible with this organism; if not, the controls for the DSB inhibitor should at least be provided.

      As mentioned in our response to comment 7 above, we have carried out experiments that confirm that our DSB inhibitor results are due to specific inhibition of the DSB system and not because of off-target effects.

      Moreover, in response to this comment, we have further demonstrated that our results are due to the specific interaction of DsbA with β-lactamase enzymes by complementing dsbA deletions in representative clinical strains of multidrug-resistant Pseudomonas aeruginosa and extremely-drug-resistant Stenotrophomonas maltophilia. We would like to note here that gene complementation in clinical isolates remains very rare in the literature due to their high levels of resistance and limited genetic tractability. Most of the few complementation examples reported for these two organisms are limited to strains that, although pathogenic, are commonly used in the lab, or to complementation efforts in non-clinical strain systems (for example use of P. aeruginosa PA14 for complementation, instead of the focal clinical isolate).

      We tested three different complementation strategies, two of which ended up being unsuccessful. After approximately 9 months of work, we succeeded in complementing a representative clinical strain for each organism (P. aeruginosa CDC #769 dsbA1 and S. maltophilia AMM dsbA dsbL) by inserting the dsbA1 gene from P. aeruginosa PAO1 into the Tn7 site on the chromosome. Both clinical strains show full complementation for every antibiotic tested; our complementation results can be found in Fig. S2B,D of the revised manuscript.

      The following text was added for P. aeruginosa clinical isolates:

      “We have demonstrated the specific interaction of DsbA with the tested β-lactamase enzymes in our E. coli K-12 inducible system using gentamicin controls (Fig. 1 and File S2A) and gene complementation (Fig. S1). To confirm the specificity of this interaction in P. aeruginosa, we performed representative control experiments in one of our clinical strains, P. aeruginosa CDC #769. We first tested the general ability of P. aeruginosa CDC #769 dsbA1 to resist antibiotic stress by recording MIC values against gentamicin, and found it unchanged compared to its parent (Fig. S2A). Gene complementation in clinical isolates is especially challenging and rarely attempted due to the high levels of resistance and lack of genetic tractability in these strains. Despite these challenges, to further ensure the specificity of the interaction of DsbA with tested β-lactamases in P. aeruginosa, we have complemented dsbA1 from P. aeruginosa PAO1 into P. aeruginosa CDC #769 dsbA1. We found that complementation of dsbA1 restores MICs to wild-type values for both tested β-lactam compounds (Fig. S2B), further demonstrating that our results in P. aeruginosa clinical strains are not confounded by off-target effects” (lines 226-239 of the revised manuscript).

      The following text was added for S. maltophilia clinical isolates: 

      “Since the dsbA and dsbL are organized in a gene cluster in S. maltophilia, we wanted to ensure that our results reported above were exclusively due to disruption of disulfide bond formation in this organism. First, we recorded gentamicin MIC values for S. maltophilia AMM dsbA dsbL and found them to be unchanged compared to the gentamicin MICs of the parent strain (Fig. S2C). This confirms that disruption of disulfide bond formation does not compromise the general ability of this organism to resist antibiotic stress. Next, we complemented S. maltophilia AMM dsbA dsbL. The specific oxidative roles and exact regulation of DsbA and DsbL in S. maltophilia remain unknown. For this reason and considering that genetic manipulation of extremely-drug-resistant organisms is challenging, we used our genetic construct optimized for complementing P. aeruginosa CDC #769 dsbA1 with dsbA1 from P. aeruginosa PAO1 (Fig. S2B) to also complement S. maltophilia AMM dsbA dsbL. We based this approach on the fact that DsbA proteins from one species have been commonly shown to be functional in other species [27-30]. Indeed, we found that complementation of S. maltophilia AMM dsbA dsbL with P. aeruginosa PAO1 dsbA1 restores MICs to wild-type values for both ceftazidime and colistin (Fig. S2D), conclusively demonstrating that our results in S. maltophilia are not confounded by off-target effects” (lines 282-297 of the revised manuscript).

      (9) In Figure 5E, the growth inhibition and loss of Pa CFU in 4 ug/mL ceftazidime for the Sm co-culture condition, which is subsequently lost in the Sm dsbA dsbL co-culture, does not appear to be discussed. As Pa is shown to grow fine in monoculture at this concentration, this result should be discussed in relation to the co-culture dynamics. Is it expected or observed that WT Sm is out-competing Pa under this condition and growing to a high CFU/mL? This would seem to have parallels to citation 49.

      As requested by this reviewer (see comment 10 below), we simultaneously tracked the abundance of P. aeruginosa and S. maltophilia strains in our cross-protection experiment. During this process we probed the abundances of the two organisms at 4 µg/mL of ceftazidime. Our results can be seen in Fig. S3B of the revised manuscript. The reviewer is correct and these effects are due to competition between P. aeruginosa and S. maltophilia with the latter being able to reach very high CFUs in this antibiotic concentration. 

      The following text on co-culture dynamics was added to our revised manuscript: 

      “At low antibiotic concentrations, for example 4 μg/mL of ceftazidime, S. maltophilia AMM is fully resistant and thrives, thus outcompeting P. aeruginosa PA14 (dark pink and dark blue bars in Fig. S3B). The same can also be seen in Fig. 4E, whereby decreased P. aeruginosa PA14 CFUs are recorded. By contrast S. maltophilia AMM dsbA dsbL already displays decreased growth at 4 μg/mL of ceftazidime because of its non-functional L1-1 enzyme, allowing comparatively higher growth of P. aeruginosa (light pink and light blue bars in Fig. S3B)” (lines 384-390 of the revised manuscript).

      (10) The data presented in Figure 5E would be augmented by the inclusion of, for at least a few representative cases, the Sm CFUs relative to the Pa CFUs. In describing the protective effects of Sm on Pa for imipenem treatment, the authors of citation 12 note that the effect was dependent on Sm cell density. This raises the immediate question of whether the protection observed in this work is similarly dependent on cell density of Sm. It is unclear if the authors expect Sm to persist under these conditions, and it seems Sm CFU should be expected to be relatively high considering it is pre-incubated for 6 hours prior to the assay. What is the physiological state of these cells, and how are they affected by ceftazidime? While many other variables are likely relevant to the translation of this protection, the relative abundance and localization of Sm and Pa commonly observed in CF patients, as well as the effective concentration of antibiotic observed in vivo, is likely worth consideration.

      As mentioned in our response to comment 9 above, we have simultaneously tracked the abundance of P. aeruginosa and S. maltophilia strains in our cross-protection experiment for select antibiotic concentrations. To be able to perform this experiment, we had to label two extremely-drug-resistant strains of S. maltophilia with an antibiotic resistance marker that allowed us to quantify them in mixtures with P. aeruginosa. Our results can be found in Fig. S3 of our revised manuscript and, in a nutshell, show that ceftazidime treatment leads to eradication of both P. aeruginosa and S. maltophilia when disulfide bond formation is impaired in S. maltophilia.

      The following text was added to address the questions of the reviewer:

      “Due to the naturally different growth rates of these two species (S. maltophilia grows much slower than P. aeruginosa) especially in laboratory conditions, the protocol we followed [1] requires S. maltophilia to be grown for 6 hours prior to co-culturing it with P. aeruginosa. To ensure that at this point in the experiment our two S. maltophilia strains, with and without dsbA, had grown comparatively to each other, we determined their cell densities (Fig. S3A). We found that S. maltophilia AMM dsbA dsbL had grown at a similar level as the wild-type strain, and both were at a higher cell density [~10⁷ colony forming units (CFUs)] compared to the P. aeruginosa PA14 inoculum (5 × 10⁴ CFUs)” (lines 353-361 of the revised manuscript).

      “To ensure that ceftazidime treatment leads to eradication of both P. aeruginosa and S. maltophilia when disulfide bond formation is impaired in S. maltophilia, we monitored the abundance of both strains in each synthetic community for select antibiotic concentrations (Fig. S3B). In this experiment we largely observed the same trends as in Fig. 4E. At low antibiotic concentrations, for example 4 μg/mL of ceftazidime, S. maltophilia AMM is fully resistant and thrives, thus outcompeting P. aeruginosa PA14 (dark pink and dark blue bars in Fig. S3B). The same can also be seen in Fig. 4E, whereby decreased P. aeruginosa PA14 CFUs are recorded. By contrast S. maltophilia AMM dsbA dsbL already displays decreased growth at 4 μg/mL of ceftazidime because of its non-functional L1-1 enzyme, allowing comparatively higher growth of P. aeruginosa (light pink and light blue bars in Fig. S3B). Despite the competition between the two strains, P. aeruginosa PA14 benefits from S. maltophilia AMM’s high hydrolytic activity against ceftazidime, which allows it to survive and grow in high antibiotic concentrations even though it is not resistant (see 128 μg/mL; dark pink and dark blue bars in Fig. S3B). In stark opposition, without its disulfide bond in S. maltophilia AMM dsbA dsbL, L1-1 cannot confer resistance to ceftazidime, resulting in killing of S. maltophilia AMM dsbA dsbL and, consequently, also of P. aeruginosa PA14 (see 128 μg/mL; light pink and light blue bars in Fig. S3B).

      The data presented here show that, at least under laboratory conditions, targeting protein homeostasis pathways in specific recalcitrant pathogens has the potential to not only alter their own antibiotic resistance profiles (Fig. 3 and 4A-D), but also to influence the antibiotic susceptibility profiles of other bacteria that co-occur in the same conditions (Fig. 5). Admittedly, the conditions in a living host are too complex to draw direct conclusions from this experiment. That said, our results show promise for infections, where pathogen interactions affect treatment outcomes, and whereby their inhibition might facilitate treatment” (lines 381-406 of the revised manuscript).

      (11) Regarding the role of microbial interactions in CF and other disease/infection contexts, the authors should temper their descriptions in accordance with citations provided. As an example, lines 96-99: "For example, in the CF lung, highly drug-resistant S. maltophilia strains actively protect susceptible P. aeruginosa from β-lactam antibiotics [12], and ultimately facilitate the evolution of β-lactam resistance in P. aeruginosa [14]."

      Neither citation provided here attests to Sm protection of Pa "in the CF lung". Both papers use a simplified in vitro co-culture model to assess Sm protection of Pa from antibiotics and the evolution of Pa antibiotic resistance in the presence or absence of Sm, respectively. In the latter case, it should also be noted that while the authors observed somewhat faster Pa resistance evolution in one co-culture condition, they did not observe it in the other, and that resistance evolution in general was observed regardless of co-culture condition. There are also statements in the ultimate and penultimate paragraphs of the Discussion section that repeat these points. The authors could re-frame this aspect of their investigation as part of a working hypothesis related to potential interactions of these pathogens, and should appropriately caveat what is and is not known from in vitro and in vivo/clinical work.

      Thank you for your comment. You are entirely correct. We have amended the text throughout our revised manuscript to avoid overstating these findings and to be clear about the fact that they originate from experimental studies. Please find below representative examples of such passages:

      “In particular, some antibiotic resistance proteins, like β-lactamases, which decrease the quantities of active drug present, function akin to common goods, since their benefits are not limited to the pathogen that produces them but can be shared with the rest of the bacterial community. This means that their activity enables pathogen cross-resistance when multiple species are present [1,31], something that was demonstrated in recent work investigating the interactions between pathogens that naturally co-exist in CF infections. More specifically, it was shown that in laboratory co-culture conditions, highly drug-resistant S. maltophilia strains actively protect susceptible P. aeruginosa from β-lactam antibiotics [1]. Moreover, this cross-protection was found to facilitate, at least under specific conditions, the evolution of β-lactam resistance in P. aeruginosa [32]” (lines 47-57 of the revised manuscript).

      “The antibiotic resistance mechanisms of S. maltophilia impact the antibiotic tolerance profiles of other organisms that are found in the same infection environment. S. maltophilia hydrolyses all β-lactam drugs through the action of its L1 and L2 β-lactamases [7,8]. In doing so, it has been experimentally shown to protect other pathogens that are, in principle, susceptible to treatment, such as P. aeruginosa [1]. This protection, in turn, allows active growth of otherwise treatable P. aeruginosa in the presence of complex β-lactams, like imipenem [1], and, at least in some conditions, increases the rate of resistance evolution of P. aeruginosa against these antibiotics [32]” (lines 332-340 of the revised manuscript).

      (12) Regarding the role of S. maltophilia in CF disease, the authors should either discuss clinical associations more completely or note the conflicting data on its role in disease. As an example, lines 84-87: "As a result, the standard treatment option, i.e., broad-spectrum β-lactam antibiotic therapy, constitutes a severe risk for CF patients carrying both P. aeruginosa and S. maltophilia [10,11], creating an urgent need for antimicrobial approaches that will be effective in eliminating both pathogens."

      It is unclear how this treatment results in a "severe risk" for CF patients colonized by both Sm and Pa. Citation 10 suggests an association between anti-pseudomonal antibiotic use and increased prevalence of Sm, but neither citation supports a worsening clinical outcome from this treatment. Citation 10 further notes that clinical scores between Sm-positive and control cohorts could not be distinguished statistically. Citation 11 is a review that makes note of this conflicting data regarding Sm, including reference to a more recent (at the time) result using multivariate analysis showing no independent effect of Sm on survival.

      The above point similarly applies to other statements in the manuscript, for example at lines 266-267: "Considering the contribution of S. maltophilia strains to treatment failure in CF lung infections [8,10,11][...]" As well as lines 79-80: "Pulmonary exacerbations and severe disease states are also associated with the presence of S. maltophilia [8]"

      Again, the provided citations do not support the implication that Sm specifically 'contributes to treatment failure in CF lung infections' or that Sm is specifically associated with severe disease states. In addition to the previously discussed citations, citation 8 describes broad "pulmotypes" composed of 10 species/genera that could be associated with particular clinical (e.g., exacerbation) or treatment (e.g., antibiotic therapy) characteristics, but these cannot, without further analysis, be associated with, or causally linked to, a specific pathogen. While pulmotype 2 in citation 8 was associated with a more severe clinical state and appeared to have the highest relative abundance of Sm compared to other pulmotypes, Sm was not identified (Figure 4A) as an independent factor that distinguishes between moderate and severe disease, unlike Pa and some anaerobes (4F-H). The authors also observed that decreasing relative abundance of Pa, in particular, is correlated with subsequent exacerbation, but did not correlate this with the presence of any other species or genera. Again, this should be re-framed with the appropriate caveat that this is a hypothesis with possible clinical significance.

      Several suggested papers are included below on Sm association with clinical characteristics to incorporate into the manuscript if the authors choose to do so:

      https://doi.org/10.1177/14782715221088909

      https://doi.org/10.1016/j.prrv.2010.07.003

      https://doi.org/10.1016/j.jcf.2013.05.009

      https://doi.org/10.1002/ppul.23943

      https://doi.org/10.1002/14651858.CD005405.pub2

      https://doi.org/10.1164/rccm.2109078

      http://dx.doi.org/10.1136/thx.2003.017707

      https://erj.ersjournals.com/content/23/1/98.short

      Thank you for your comment. You are entirely correct. We have amended the text throughout our revised manuscript to avoid overstating the role of S. maltophilia in CF infections and to reference additional relevant works in the literature. Please find below representative examples of such passages:

      “On the other hand, CF microbiomes are increasingly found to encompass S. maltophilia [2-4], a globally distributed opportunistic pathogen that causes serious nosocomial respiratory and bloodstream infections [5-7]. S. maltophilia is one of the most prevalent emerging pathogens [6] and it is intrinsically resistant to almost all antibiotics, including β-lactams like penicillins, cephalosporins and carbapenems, as well as macrolides, fluoroquinolones, aminoglycosides, chloramphenicol, tetracyclines and colistin. As a result, the standard treatment option for lung infections, i.e., broad-spectrum β-lactam antibiotic therapy, is rarely successful in countering S. maltophilia [7,8], creating a definitive need for approaches that will be effective in eliminating both pathogens” (lines 33-41 of the revised manuscript).

      “Of the organisms studied in this work, S. maltophilia deserves further discussion because of its unique intrinsic resistance profile. The prognosis of CF patients with S. maltophilia lung carriage is still debated [4,9-16], largely because studies with extensive and well-controlled patient cohorts are lacking. This notwithstanding, the therapeutic options against this pathogen are currently limited to one non-β-lactam antibiotic-adjuvant combination, trimethoprim-sulfamethoxazole [17-20], which is not always effective, and a few last-line β-lactam drugs, like the fifth-generation cephalosporin cefiderocol and the combination aztreonam-avibactam. Resistance to commonly used antibiotics causes many problems during treatment and, as a result, infections that harbor S. maltophilia have high case fatality rates [7]. This is not limited to CF patients, as S. maltophilia is a major cause of death in children with bacteremia [5]” (lines 440-450 of the revised manuscript).

      Reviewer #3 (Recommendation For the Authors):

      (1) The referencing of supplemental figures does not follow a sequential order. For example, Figure S2 appears in the text before S1. The sequential ordering of figure numbers improves the readability and can be considered while editing the manuscript for revision.

      Thank you for this comment. This is amended in our revised manuscript and supplemental figures and files are cited in order.

      (2) It will be useful to provide a brief description of Ambler classes since these are important to study design (for a broader audience).

      Thank you for this suggestion. This has been added and can be found in lines 91-101 of the revised manuscript.

      (3) The rationale for using the K-12 strain of E. coli should be provided. It appears that this is a model system that is well established in their lab, but a scientific rationale can be listed. Maybe this strain does not have any lactamases in its genome other than the one being expressed, as compared to pathogenic E. coli?

      Thank you for this suggestion. This has been added and can be found in lines 104-106 of the revised manuscript.

      (4) The authors used a worm model to test their observations, which is relevant. Given the significant implications of their work in overcoming resistance to clinically used antibiotics and the availability of already generated dsbA mutants in clinical strains, it will be useful to investigate survival in animal models or at least wound models of Pseudomonas infections. The reviewer does not deem this necessary, but it will significantly increase the impact of their seminal work.

      Thank you for this comment. We appreciate the sentiment, and we would have liked to be able to perform experiments in a murine model of infection. There are several reasons that made this not possible, and as a result we used G. mellonella as an informative preliminary in vivo infection model. The DSB proteins have been shown to play a central role in bacterial virulence. Because of this, our P. aeruginosa and S. maltophilia mutant strains are not efficient in establishing an infection, even in a wound model. This could be overcome had we been able to use the chemical inhibitor of the DSB system in vivo; however, this is also not possible. This is due to the fact that the chemical compound that we use to inhibit the function of DsbA acts on DsbB. Inhibition of DsbB blocks the re-oxidation of DsbA and leads to its accumulation in its inactive reduced form. However, the action of the inhibitor can be bypassed through re-oxidation and re-activation of DsbA by small-molecule oxidants such as L-cystine, which are abundant in rich growth media or animal tissues. This makes the inhibitor only suitable for in vitro assays that can be performed in minimal media, where the presence of small-molecule oxidants can be strictly avoided, but entirely unsuitable for an insect or a vertebrate animal model.

    1. eLife Assessment

      This work represents an important contribution to our understanding of how membrane energetics influence protein conformation and function in mechano-sensitive channels. Through extensive molecular dynamics simulations and energetic analysis, the study convincingly demonstrates how the channel structure is shaped by a balance of protein and membrane-induced forces, effectively reconciling experimental data from different membrane environments. This work will appeal broadly to researchers and readers with interests in ion channel structure and function, mechanosensation, and membrane biophysics.

    2. Reviewer #1 (Public review):

      Dixit, Noe, and Weikl apply coarse-grained and all-atom molecular dynamics to determine the response of the mechanosensitive proteins Piezo 1 and Piezo 2 to tension. Cryo-EM structures in micelles show a high curvature of the protein whereas structures in lipid bilayers show lower curvature. Is the zero-stress state of the protein closer to the micelle structure or the bilayer structure? Moreover, while the tension sensitivity of channel function can be inferred from experiment, molecular details are not clearly available. How much does the protein's height and effective area change in response to tension? With these in hand, a quantitative model of its function follows that can be related to the properties of the membrane and the effect of external forces.

      Simulations indicate that in a bilayer the protein relaxes from the highly curved cryo-EM dome (Figure 1).

      Under applied tension the dome flattens (Figure 2) including the underlying lipid bilayer. The shape of the system is a combination of the membrane mechanical and protein conformational energies (Eq. 1). The membrane mechanical energy is well-characterized. It requires only the curvature and bending modulus as inputs. They determine membrane curvature and the local area metric (Eq. 4) by averaging the height on a grid and computing second derivatives (Eqs. 7, 8) consistent with known differential geometric formulas.
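      The grid-based procedure summarized above can be illustrated with a minimal sketch. It is not the authors' code and does not reproduce their Eqs. 4, 7, and 8; it assumes the standard Monge-gauge expressions for the mean curvature and area element of a height field h(x, y), and a Helfrich bending energy with zero spontaneous curvature. The function names `grid_heights` and `bending_energy`, the bin edges, and the units (lengths in nm, bending modulus in units of k_BT) are hypothetical choices made for illustration.

```python
import numpy as np

def grid_heights(x, y, z, x_edges, y_edges):
    """Average the height z of membrane particles in each (x, y) grid cell."""
    total, _, _ = np.histogram2d(x, y, bins=[x_edges, y_edges], weights=z)
    count, _, _ = np.histogram2d(x, y, bins=[x_edges, y_edges])
    with np.errstate(invalid="ignore", divide="ignore"):
        return total / count  # NaN where a cell contains no particles

def bending_energy(h, dx, dy, kappa):
    """Bending energy of a gridded height field h[i, j] = h(x_i, y_j).

    Assumes the Monge gauge and zero spontaneous curvature:
    E = (kappa / 2) * sum over cells of (2H)^2 * dA.
    """
    # First and second derivatives by finite differences on the grid.
    hx, hy = np.gradient(h, dx, dy)
    hxx, hxy = np.gradient(hx, dx, dy)
    _, hyy = np.gradient(hy, dx, dy)

    g = 1.0 + hx**2 + hy**2  # determinant of the surface metric
    mean_curv = ((1 + hy**2) * hxx - 2 * hx * hy * hxy
                 + (1 + hx**2) * hyy) / (2 * g**1.5)
    area_elem = np.sqrt(g) * dx * dy  # true area of each projected cell

    energy = 0.5 * kappa * np.sum((2 * mean_curv) ** 2 * area_elem)
    return energy, mean_curv, area_elem
```

      As a sanity check, a spherical cap of radius R should give a mean curvature of magnitude 1/R away from the grid boundary, where the one-sided finite differences are less accurate. Repeating the calculation with a different grid spacing is one way to probe the sensitivity to discretization that the review discusses.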

      While I am still critical generally of a precise estimate of the energy from simulated membrane shapes (after all it is not trivial to precisely determine even the bending modulus from a simulation), I believe with their revision the authors have convinced me that their estimate is a high quality one, without obvious issues. Although there appears to have been a miscommunication about increasing the density of grain or lowering the density of grain, the authors have tried two grains and determined a similar deformation energy, which addresses my concern. Furthermore, they have computed a dramatically reduced simplification of the curve and determined a similar value.

      In summary, this paper uses molecular dynamics simulations to quantify the force of the Piezo 1 and Piezo 2 proteins on a lipid bilayer using simulations under controlled tension, observing the membrane deformation, and using that data to infer protein mechanics. While much of the physical mechanism was previously known, the study itself is a valuable quantification.

    3. Reviewer #2 (Public review):

      Summary:

      In this study the authors suggest that the structure of Piezo2 in a tensionless simulation is flatter compared to the electron microscopy structure. This is an interesting observation and highlights the fact that the membrane environment is important for Piezo2 curvature. Additionally, the authors calculate the excess area of Piezo2 and Piezo1, suggesting that it is significantly smaller compared to the area calculated using the EM structure or simulations with restrained Piezo2. Finally, the authors propose an elastic model for Piezo proteins. Those are very important findings, which would be of interest to the mechanobiology field.

      Whilst I like the suggestion that the membrane environment will change Piezo2 flatness, could this be happening because of the lower resolution of the MARTINI simulations? In other words, would it be possible that MARTINI is not able to model such curvature due to its lower resolution?

      Related to my comment above, the authors say that they only restrained the secondary structure using an elastic network model. Whilst I understand why they did this, Piezo proteins are relatively large. How can the authors know that this type of elastic network model restraints, combined with the fact that MARTINI simulations are perhaps not very accurate in predicting protein conformations, can accurately represent the changes that happen within the Piezo channel during membrane tension?

      Modelling of Piezo1 seems to be based on homology to Piezo2. However, the authors need to further evaluate their model, e.g. how it compares with an AlphaFold model.

      To calculate the tension-induced flattening of the Piezo channel, the authors "divide all simulation trajectories into 5 equal intervals and determine the nanodome shape in each interval by averaging over the conformations of all independent simulation runs in this interval." However, the change in the flattening of the Piezo channel probably happens very quickly during the simulations, possibly within the same interval. Is this the case? And if yes, does this affect their calculations?
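      The interval averaging quoted above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis script: the function name `interval_averages` is hypothetical, each run is assumed to have the same number of frames, and any trailing frames that do not fill a whole interval are dropped.

```python
import numpy as np

def interval_averages(runs, n_intervals=5):
    """Average an observable over equal time intervals and over all runs.

    runs: array of shape (n_runs, n_frames, ...), e.g. a height profile per frame.
    Returns an array of shape (n_intervals, ...).
    """
    runs = np.asarray(runs, dtype=float)
    n_runs, n_frames = runs.shape[:2]
    # Drop trailing frames so that every interval has the same length.
    usable = n_frames - n_frames % n_intervals
    per_interval = usable // n_intervals
    blocks = runs[:, :usable].reshape(
        n_runs, n_intervals, per_interval, *runs.shape[2:]
    )
    # Average over runs (axis 0) and over frames within each interval (axis 2).
    return blocks.mean(axis=(0, 2))
```

      Calling the same function with a larger `n_intervals` on the early part of the trajectories is one simple way to check whether most of the flattening occurs within the first interval, which is the concern raised here.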

      Finally, the authors use a specific lipid composition, which is asymmetric. Is it possible that the asymmetry of the membrane causes some of the changes in the curvature that they observe? Perhaps more controls, e.g. with a symmetric POPC bilayer, are needed to identify whether membrane asymmetry plays a role in the membrane curvature they observe.

    4. Reviewer #3 (Public review):

      Strengths:

      This work focuses on a problem of deep significance: quantifying the structure-tension relationship and underlying mechanism for the mechanosensitive Piezo 1 and 2 channels. Such an objective is challenging for molecular dynamics simulations, due to the relatively large size of each membrane-protein system. Nonetheless, the approach chosen here is based on methodology that is, in principle, established and widely accessible. Therefore, another group of practitioners would likely be able to reproduce these findings with reasonable effort.

      More specifically, while acknowledging the limitations of the MARTINI force field, this work makes a significant improvement compared to previous simulations of Piezo proteins by adopting a range of membrane tensions that includes physiologically relevant values (below 10 mN/m).

      Weaknesses:

      The two main results of this paper are (1) that both channels exhibit a flatter structure compared to cryo-EM measurements, and (2) their estimated force vs. displacement relationship. Although the former correlates at least quantitatively with prior experimental work, the latter relies exclusively on simulation results and model parameters.

      My remaining technical concerns in the revised manuscript are as follows:

      (1) At each membrane tension, all concurrent atomistic simulations were initialized from the same snapshot of a previous CG simulation: in my opinion, it is inaccurate to refer to those atomistic simulations as "independent" from each other (as is done twice in the caption of Figure 3, as well as in the text).

      (2) Continuum mechanics calculations were employed to model the membrane's curvature energetics. The bending modulus, k, was not determined for the specific lipid composition used in this study, but was instead taken from previous MARTINI simulations involving the same primary lipid, POPC. Given that these calculations are intended to describe MARTINI simulations specifically, this approximation may be acceptable. However, it does not account for the increased stiffness observed in POPC/cholesterol mixtures-an effect measured experimentally but not reproduced by the MARTINI model-nor does it reflect the asymmetric conditions, as all referenced simulations involve symmetric bilayers. As a result, the bending energies and forces shown in Figure 5(c,d) are internally consistent within the model, but they probably correspond to real values up to an unknown multiplicative factor.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review): 

      Dixit, Noe, and Weikl apply coarse-grained and all-atom molecular dynamics to determine the response of the mechanosensitive proteins Piezo 1 and Piezo 2 to tension. Cryo-EM structures in micelles show a high curvature of the protein whereas structures in lipid bilayers show lower curvature. Is the zero-stress state of the protein closer to the micelle structure or the bilayer structure? Moreover, while the tension sensitivity of channel function can be inferred from the experiment, molecular details are not clearly available. How much does the protein's height and effective area change in response to tension? With these in hand, a quantitative model of its function follows that can be related to the properties of the membrane and the effect of external forces. 

      Simulations indicate that in a bilayer the protein relaxes from the highly curved cryo-EM dome (Figure 1). 

      Under applied tension, the dome flattens (Figure 2) including the underlying lipid bilayer. The shape of the system is a combination of the membrane mechanical and protein conformational energies (Equation 1). The membrane's mechanical energy is well-characterized. It requires only the curvature and bending modulus as inputs. They determine membrane curvature and the local area metric (Equation 4) by averaging the height on a grid and computing second derivatives (Equations 7, 8) consistent with known differential geometric formulas. 

      The bending energy can be limited to the nano dome but this implies that the noise in the membrane energy is significant. Where there is noise outside the dome there is noise inside the dome. At the least, they could characterize the noisy energy due to inadequate averaging of membrane shape. 

      My concern for this paper is that they are significantly overestimating the membrane deformation energy based on their numerical scheme, which in turn leads to a much stiffer model of the protein itself.

      We agree that “thermal noise” is intrinsic to MD simulations, as in “real” systems, leading to thermally excited shape fluctuations of membranes and conformational fluctuations of proteins. However, for our coarse-grained simulations, the thermally excited membrane shape fluctuations can be averaged out quite well, and the resulting average shapes are smooth, see e.g. the shapes and lines of the contour plots in Fig. 1 and 2. For our atomistic simulations, the averaged shapes are not as smooth, see Fig. 3a and the lines of the contour plots in Fig. 3b. Therefore, we do not report bending energies for the nanodome shapes determined from atomistic simulations, because bending energy calculations are sensitive to remaining “noise” on small scales (due to the scale invariance of the bending energy), in contrast to calculations of excess areas, which we state now on lines 620ff.

      For our coarse-grained simulations, we now corroborate our bending energy calculations based on averaged 3d shapes by comparing to bending energy values obtained from highly smoothened 2d mean curvature profiles (see Fig. 1c for mean curvature profiles in tensionless membranes). We discuss this in detail from line 323 on, starting with:

      “To corroborate our bending energy calculations for these averaged three-dimensional nanodome shapes, we note that essentially identical bending energies can be obtained from the highly smoothened mean curvatures M of the two-dimensional membrane profiles. …”

      Two things would address this: 

      (1) Report the membrane energy under different graining schemes (e.g., report schemes up to double the discretization grain). 

      There are two graining schemes in the modeling, and we have followed the reviewer’s recommendation regarding the second scheme. In the first, more central graining scheme, we use quadratic membrane patches with a sidelength of about 2 nm to determine membrane midplane shapes and lipid densities of each simulation conformation. This graining scheme has also been previously employed in Hu, Lipowsky, Weikl, PNAS 38, 15283 (2013) to determine the shape and thermal roughness of coarse-grained membranes. A sidelength of 2 nm is necessary to have sufficiently many lipid headgroups in the upper and lower leaflet in the membrane patches for estimating the local height of these leaflets, and the local membrane midplane height as average of these leaflet heights (see subsection “Membrane shape of simulation conformation” in the Methods section for details).  However, we strongly believe that doubling the sidelength of membrane patches in this discretization is not an option, because a discretization length of 4 nm is too coarse to resolve the membrane deformations in the nanodome, see e.g. the profiles in Fig. 1b. Moreover, any “noise” from this discretization is rather completely smoothened out in the averaging process used in the analysis of the membrane shapes, at least for the coarse-grained simulations. This averaging process requires rotations of membrane conformations to align the protein orientations of the conformations (see subsection “Average membrane shapes and lipid densities” for details). Because of these rotations, the original discretization is “lost” in the averaging, and a continuous membrane shape is generated. To calculate the excess areas and bending energies for this smooth, continuous membrane shape, we use a discretization of the Monge plane into a square lattice with lattice parameter 1 nm. 
      As a response to the referee’s suggestion, we now report that the results for the excess area do not change significantly when doubling this lattice parameter to 2 nm. On line 597, we write:

      “For a lattice constant of a=2 nm, we obtain extrapolated values of the excess area Delta A from the coarse-grained simulations that are 2 to 3% lower than the values for a=1 nm, which is small compared to statistical uncertainties with relative errors of around 10%.”

      On lines 614ff, we now state that the bending energy results are about 10% to 13% lower for a=2 nm, likely because of the lower resolution of the curvature in the nanodome compared to a=1 nm, rather than incomplete averaging and remaining roughness of the coarse-grained nanodome shapes.
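
      The lattice-constant check described above can be illustrated with a minimal sketch. It is not the authors' analysis code: it assumes a hypothetical axisymmetric Gaussian bump (height h0 = 2.6 nm, width sigma = 6 nm, both purely illustrative) in place of the averaged nanodome shape, and it assumes central finite differences for the slopes. The excess area is taken as the sum of (sqrt(1 + |grad h|^2) - 1) a^2 over the sites of a square lattice in the Monge plane.

```python
import math


def gaussian_height(x, y, h0, sigma):
    """Height (nm) of a hypothetical axisymmetric Gaussian bump; an assumed test shape."""
    return h0 * math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))


def excess_area(a, h0=2.6, sigma=6.0, half_width=30.0):
    """Excess area (nm^2) of the bump over its projected area.

    a          : lattice constant of the square lattice in the Monge plane (nm)
    half_width : half the side length of the analysed patch (nm); assumed value
    """
    n = int(round(half_width / a))
    total = 0.0
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            x, y = i * a, j * a
            # Central finite differences for the slopes dh/dx and dh/dy.
            hx = (gaussian_height(x + a, y, h0, sigma)
                  - gaussian_height(x - a, y, h0, sigma)) / (2.0 * a)
            hy = (gaussian_height(x, y + a, h0, sigma)
                  - gaussian_height(x, y - a, h0, sigma)) / (2.0 * a)
            # Local area element minus projected area element.
            total += (math.sqrt(1.0 + hx * hx + hy * hy) - 1.0) * a * a
    return total


if __name__ == "__main__":
    area_1nm = excess_area(1.0)
    area_2nm = excess_area(2.0)
    print("excess area, a = 1 nm:", area_1nm)
    print("excess area, a = 2 nm:", area_2nm)
    print("relative change:", (area_1nm - area_2nm) / area_1nm)
```

      For this smooth test shape, the a = 2 nm value should come out a few percent below the a = 1 nm value, because coarser central differences underestimate the slopes. The size of the effect for the simulated nanodome shapes depends on their actual curvature and is not reproduced by this sketch.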

      (2) For a Gaussian bump with sigma=6 nm I obtained a bending energy of 0.6 kappa, so certainly in the ballpark with what they are reporting but significantly lower (compared to 2 kappa, Figure 5 lower left). It would be simpler to use the Gaussian approximation to their curves in Figure 3 - and I would argue more accurate, especially since they have not reported the variation of the membrane energy with respect to the discretization size and so I cannot judge the dependence of the energy on discretization. I view reporting the variation of the membrane energy with respect to discretization as being essential for the analysis if their goal is to provide a quantitative estimate for the force of Piezo. The Helfrich energy computed from an analytical model with a membrane shape closely resembling the simulated shapes would be very helpful. According to my intuition, finite-difference estimates of curvatures will tend to be overestimates of the true membrane deformation energy because white noise tends to lead to high curvature at short-length scales, which is strongly penalized by the bending energy. 

      Instead of Gaussian bumps, we now calculate the membrane bending energy also from the two-dimensional, continuous mean curvature profiles (see Fig. 1c). These mean curvature profiles are highly smoothened (see figure caption for details). Nonetheless, we obtain essentially the same bending energies as in our discrete calculations of averaged, smoothened three-dimensional membrane shapes, see new text on lines 326ff. We believe that this agreement corroborates our bending energy calculations. We still focus on values obtained for three-dimensional membrane shapes, because of incomplete rotational symmetry. The three-dimensional membrane shapes exhibit variations with the three-fold symmetry of the Piezo proteins, see Figure 2a and b.

      We agree that the bending energy of thermally rough membranes depends on the discretization scheme, because the discretization length of any discretization scheme leads to a cut-off length for fluctuation modes in a Fourier analysis. But again, we average out the thermal noise, for reasons given in the Results section, and analyse smooth membrane shapes.  
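
      The Gaussian-bump estimate raised in point (2) can be sketched as follows, under assumptions that go beyond the text. In the small-gradient (linearized Monge) approximation, the bending energy is E = (kappa/2) times the integral of (laplacian h)^2, which for h(r) = h0 exp(-r^2 / (2 sigma^2)) evaluates to E = pi kappa (h0/sigma)^2. The reviewer does not state the bump height; the sketch picks h0 so that this closed form gives 0.6 kappa at sigma = 6 nm, purely for illustration. It then compares the closed form with a five-point finite-difference estimate on square lattices with a = 1 nm and a = 2 nm. This is not the authors' numerical scheme.

```python
import math


def height(x, y, h0, sigma):
    """Height (nm) of a hypothetical axisymmetric Gaussian bump; an assumed test shape."""
    return h0 * math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))


def analytic_energy_over_kappa(h0, sigma):
    """Closed-form small-gradient bending energy of the Gaussian bump, in units of kappa."""
    return math.pi * (h0 / sigma) ** 2


def lattice_energy_over_kappa(a, h0, sigma, half_width=30.0):
    """Finite-difference bending energy in units of kappa on a square lattice.

    Uses the five-point Laplacian as a small-gradient stand-in for twice the
    mean curvature. half_width (nm) is an assumed patch size.
    """
    n = int(round(half_width / a))
    total = 0.0
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            x, y = i * a, j * a
            lap = (height(x + a, y, h0, sigma) + height(x - a, y, h0, sigma)
                   + height(x, y + a, h0, sigma) + height(x, y - a, h0, sigma)
                   - 4.0 * height(x, y, h0, sigma)) / (a * a)
            total += 0.5 * lap * lap * a * a
    return total


SIGMA = 6.0                                  # nm, value quoted by the reviewer
H0 = SIGMA * math.sqrt(0.6 / math.pi)        # nm, assumed so that E = 0.6 kappa

if __name__ == "__main__":
    print("assumed bump height h0 (nm):", H0)
    print("closed form     :", analytic_energy_over_kappa(H0, SIGMA))
    print("lattice a = 1 nm:", lattice_energy_over_kappa(1.0, H0, SIGMA))
    print("lattice a = 2 nm:", lattice_energy_over_kappa(2.0, H0, SIGMA))
```

      For a noise-free shape such as this one, the finite-difference values lie slightly below the closed form and decrease with increasing lattice constant, since a coarser lattice resolves less of the curvature. Any overestimate of the kind the reviewer describes would therefore have to come from residual roughness in the averaged shapes rather than from the discretization of a smooth surface.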

      The fitting of the system deformation to the inverse time appears to be incredibly ad hoc ... Nor is it clear that the quantified model will be substantially changed without extrapolation. The authors should either justify the extrapolation more clearly (sorry if I missed it!) or also report the unextrapolated numbers alongside the extrapolated ones. 

      We report the values of the excess area and bending energy in the different time intervals of our analysis as data points in Fig. 4 with supplement. We find it important to report the time dependence of these quantities, because the intended equilibration of the membrane shapes in our simulations is not “complete” within a certain time window of the simulations. So, just “cutting” the first 20 and 50% of the simulation trajectories, and analysing the remaining parts as “equilibrated” does not seem to be a reasonable choice here, at least for the membrane properties, i.e. for the excess area and bending energy. We agree that the linear extrapolation used in our analysis is a matter of choice. At least for the coarse-grained simulations, the extrapolated values of excess areas and bending energies are rather close to the values obtained in the last time windows (see Figure 4). 
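
      A linear extrapolation in inverse time of the kind discussed here can be sketched as below. The details are assumptions, not the authors' procedure: each of the five intervals is represented by its midpoint time, the quantity is fitted by ordinary least squares against 1/t, and the intercept at 1/t = 0 is taken as the extrapolated value. The numbers are synthetic and chosen only so that the expected intercept is known.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def extrapolate_to_infinite_time(times, values):
    """Fit values against 1/t and return the intercept, i.e. the limit t -> infinity."""
    inverse_times = [1.0 / t for t in times]
    _, intercept = fit_line(inverse_times, values)
    return intercept


# Hypothetical example: a 4-microsecond trajectory split into 5 equal intervals,
# each represented by its midpoint time (microseconds).
MIDPOINTS = [0.4, 1.2, 2.0, 2.8, 3.6]
# Synthetic excess-area values (nm^2) that relax exactly as 40 + 6/t.
SYNTHETIC_VALUES = [40.0 + 6.0 / t for t in MIDPOINTS]

if __name__ == "__main__":
    print("value in last interval:", SYNTHETIC_VALUES[-1])
    print("extrapolated value    :",
          extrapolate_to_infinite_time(MIDPOINTS, SYNTHETIC_VALUES))
```

      Printing the last-interval value next to the intercept mirrors the reviewer's request to report unextrapolated and extrapolated numbers side by side.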

      In summary, this paper uses molecular dynamics simulations to quantify the force of the Piezo 1 and Piezo 2 proteins on a lipid bilayer using simulations under controlled tension, observing the membrane deformation, and using that data to infer protein mechanics. While much of the physical mechanism was previously known, the study itself is a valuable quantification. I identified one issue in the membrane deformation energy analysis that has large quantitative repercussions for the extracted model. 

      Reviewer #2 (Public review): 

      Summary: 

      In this study, the authors suggest that the structure of Piezo2 in a tensionless simulation is flatter compared to the electron microscopy structure. This is an interesting observation and highlights the fact that the membrane environment is important for Piezo2 curvature. Additionally, the authors calculate the excess area of Piezo2 and Piezo1, suggesting that it is significantly smaller compared to the area calculated using the EM structure or simulations with restrained Piezo2. Finally, the authors propose an elastic model for Piezo proteins. Those are very important findings, which would be of interest to the mechanobiology field. 

      Whilst I like the suggestion that the membrane environment will change Piezo2 flatness, could this be happening because of the lower resolution of the MARTINI simulations? In other words, would it be possible that MARTINI is not able to model such curvature due to its lower resolution? 

      Related to my comment above, the authors say that they only restrained the secondary structure using an elastic network model. Whilst I understand why they did this, Piezo proteins are relatively large. How can the authors know that this type of elastic network model restraints, combined with the fact that MARTINI simulations are perhaps not very accurate in predicting protein conformations, can accurately represent the changes that happen within the Piezo channel during membrane tension? 

      These questions regarding the reliability of the Martini model are very reasonable and are the reason why we include also results from atomistic simulations, at least for Piezo 2, and compare the results. In the Martini model, secondary structure constraints are standard. In addition, constraints on the tertiary structure (e.g. via an elastic network model) are also typically used in simulations of soluble, globular proteins. However, such tertiary constraints would make it impossible to simulate the tension-induced flattening of the Piezo proteins. So instead, as we write on lines 427ff, “we relied on the capabilities of the Martini coarse-grained force field for modeling membrane systems with TM helix assemblies (Sharma and Juffer, 2013; Chavent et al., 2014; Majumder and Straub, 2021).” In these references, Martini simulations were used to study the assembly of transmembrane helices, leading to agreement with experimentally observed structures. As we state in our article, our atomistic simulations corroborate the Martini simulations, with the caveats that are now more extensively discussed in the new last paragraph of the Discussion section starting on line 362.

      Modelling of Piezo1 seems to be based on homology to Piezo2. However, the authors need to further evaluate their model, e.g. how it compares with an Alphafold model. 

      We understand the question, but see it beyond the scope of our article, also because of the computational demand of the simulations. The question is: Do coarse-grained simulations of Piezo1 based on an Alphafold model as starting structure lead to different results? It is important to note that we only model the rather flexible 12 TM helices at the outer ends of the Piezo 1 monomers via homology modeling to the Piezo 2 structure, which includes these TM helices. For the inner 26 TM helices, including the channel, we use the high-quality cryo-EM structure of Piezo 1. Alphafold may be an alternative for modeling the outer 12 helices, but we don’t think this would lead to statistically significant differences in simulations – e.g. because of the observed overall agreement of membrane shapes in all our Piezo 1 and Piezo 2 simulation systems.

      To calculate the tension-induced flattening of the Piezo channel, the authors "divide all simulation trajectories into 5 equal intervals and determine the nanodome shape in each interval by averaging over the conformations of all independent simulation runs in this interval.". However, probably the change in the flattening of Piezo channel happens very quickly during the simulations, possibly within the same interval. Is this the case? and if yes does this affect their calculations? 

      Unfortunately, the flattening is not sufficiently quick, so is not complete within the first time windows, see data points in Figure 4. We therefore report the time dependence with the plots in Figure 4 and extrapolate, see also our response above to reviewer 1.

      Finally, the authors use a specific lipid composition, which is asymmetric. Is it possible that the asymmetry of the membrane causes some of the changes in the curvature that they observe? Perhaps more controls, e.g. with a symmetric POPC bilayer are needed to identify whether membrane asymmetry plays a role in the membrane curvature they observe. 

      Because of the rather high computational demands, such controls are beyond our scope. We don’t expect statistically significant differences for symmetric POPC/cholesterol bilayers. On lines 229ff, we now state:

      “Our modelling assumes that any spontaneous curvature from asymmetries in the lipid composition is small compared to the curvature of the nanodome and, thus, negligible, which is plausible for the rather slight lipid asymmetry of our simulated membranes (see Methods).”

      Reviewer #3 (Public review): 

      Strengths: 

      This work focuses on a problem of deep significance: quantifying the structure-tension relationship and underlying mechanism for the mechanosensitive Piezo 1 and 2 channels. This objective presents a few technical challenges for molecular dynamics simulations, due to the relatively large size of each membrane-protein system. Nonetheless, the technical approach chosen is based on the methodology that is, in principle, established and widely accessible. Therefore, another group of practitioners would likely be able to reproduce these findings with reasonable effort. 

      Weaknesses: 

      The two main results of this paper are (1) that both channels exhibit a flatter structure compared to cryo-EM measurements, and (2) their estimated force vs. displacement relationship. Although the former correlates at least quantitatively with prior experimental work, the latter relies exclusively on simulation results and model parameters. 

      Below is a summary of the key points we recommend addressing in a revised version of the manuscript: 

      (1) The authors should report and discuss controls for the membrane energy calculations, specifically by increasing the density of the discretization graining. We also suggest validating the bending modulus used in the energy calculations for the specific lipid mixture employed in the study. 

      We have addressed both points, see our response to the reviewer’s comments for further details.

      (2) The authors should consider and discuss the potential limitations of the coarse-grained simulation force field and clarify how atomistic simulations validate the reported results, with a more detailed explanation of the potential interdependencies between the two. 

      We now discuss the caveats in the comparison of coarse-grained and atomistic simulations in more detail in a new paragraph starting on line 362.

      (3) The authors should provide further clarification on other points raised in the reviewers' comments, for instance, the potential role of membrane asymmetry. 

      We have done this – see above. We now further explain on lines 437ff why we use an asymmetric membrane. On lines 230ff, we discuss that any spontaneous membrane curvature due to lipid asymmetry is likely small compared to the nanodome curvature and, thus, negligible.

      Reviewer #1 (Recommendations for the authors): 

      (1) Report discretization dependence of the membrane energy (up to double the density of the current discretization graining). 

      We have added several text pieces in the paragraph “Excess area and bending energy” starting on line 583 in which we state how the results depend on the lattice constant a of the calculations.

      (2) Evaluate an analytical energy of a membrane bump with a shape similar to the simulation. This would be free of all sampling and discretization artifacts and would thus be an excellent lower bound of the energy. 

      We have done this for the curvature profile in Figure 1c and corresponding curvature profiles of the shape profiles in Figure 2d, see new text on lines 326ff.

      Minor: 

      (1)  The lipid density (Figure 1 right, 2c, 3c) is not interesting nor is it referred to. It can be dropped. 

      We think the lipid density maps are important for two reasons: First, they show the protein shape obtained after averaging conformations, as low-lipid-density regions. Second, the lipid densities are used in the calculation of the bending energies, to limit the bending energy calculations to the membrane in the nanodome, see Eq. 9. We therefore prefer to keep them.

      (2) Figure 7 is attractive but not used in a meaningful way. I suggest inserting the protein graphic from Figure 7 into Figure 1 with the 4-helix bundles numbered alongside the structure. Figure 7 could then be dropped. 

      Figure 7 is a figure of the Methods section. We need it to illustrate and explain aspects of the setup (numbering of helices, missing loops) and analysis (numbering scheme of 4-TM helix units).

      (3) Some editing of the use of the English language would be helpful. "Exemplary" is a bit of a funny word choice, it implies that the conformation is excellent, and not simply representative. I'd suggest "Representative conformation". 

      We agree and have replaced “exemplary” by “representative”.

      (4) Typos: 

      Equation 4 - Missing parentheses before squared operator inside the square root. 

      We have corrected this mistake.

      Reviewer #2 (Recommendations for the authors): 

      This study focuses mainly on Piezo2; the authors do not perform any atomistic simulations of Piezo1, and the coarse-grained simulations for Piezo1 are shorter. As a result, their analysis for Piezo2 seems more complete. It would be good if the authors did similar studies with Piezo1 as with Piezo2. 

      We agree that atomistic simulations of Piezo 1 would be interesting, too. However, because the atomistic simulations are particularly demanding, this is beyond our scope.

      Reviewer #3 (Recommendations for the authors): 

      (1) At line 63, a very large tension from the previous work by De Vecchis et al is reported (68 mN/m). The authors are sampling values up to about 21 mN/m, which is considerably smaller. However, these values greatly exceed what typical lipid membranes can sustain (about 10 mN/m) before rupturing. When mentioning these large tensions, the authors should emphasize that these values are not physiologically significant, because they would rupture most plasma membranes. That said, their use in simulation could be justified to magnify the structural changes compared to experiments. 

      We agree that our largest membrane tension values are unphysiological. However, we see a main novelty and relevance of our simulations in the fact that we obtain a response of the nanodome in the physiological range of membrane tensions, see e.g. the 3rd sentence of the abstract. Yes, we include simulations at tensions of 21 mN/m, but most of our simulated tension values are in the range from 0 to 10 mN/m (see e.g. Fig. 3e), in contrast to previous simulation studies.   

      (2) At line 78 and in the Methods, only the reference paper is for the CHARMM protein force field, but not for the lipid force field. 

      We have added the reference Klauda et al., 2010 for the CHARMM36 lipid force field in both spots.

      (3) (Line 83) Acknowledging that the authors needed to use the structure from micelles (because it has atomic resolution), how closely do their relaxed Piezo structures compare with the lower-resolution data from the MacKinnon and Patapoutian papers? 

      There are no structures reported in these papers to compare with, only a clear flattening as stated.  

      (4) (Line 99) The authors chose a slightly asymmetric lipid membrane composition to capture some specific plasma-membrane features. However, they do not discuss which features are described by this particular composition, which doesn't include different acyl-chain unsaturations between leaflets. Further, they do not seem to comment on whether there is enrichment of certain lipid species coupled to curvature, or whether there is any "scrambling" occurring when the dome section and the planar membrane are stitched together in the preparation phase (Figure 8). 

      Enrichment of lipids in contact with the protein is addressed in the reference Buyan et al., 2020, based on Martini simulations with Piezo 1. We have a different focus, but still wanted to keep an asymmetric membrane as in essentially all previous simulation studies as now stated also on lines 439ff, to mimic the native Piezo membrane environment. There is no apparent “scrambling” in the setup of our membrane systems. We also did not explore any coupling between curvature and lipid composition, but will publish the simulation trajectories to enable such studies.  

      (5) (Caption of Figure 2). Please comment briefly in the text why the tensionless simulation required a longer simulation run (e.g. larger fluctuations?) 

      We added the following explanation on line 500: “ … to explore the role of the long-range shape fluctuations in tensionless membranes for the relaxation into equilibrium”. The relaxation time of membrane shape fluctuations strongly increases with the wavelength, which is only limited by the simulation box size in the absence of tension. However, even for 8 microsecond trajectories, we do not observe complete equilibration and therefore decided to extrapolate the excess area and bending energy values obtained for different time intervals of the trajectories.

      (6) (Caption of Figure 3). Please clarify in the Methods how the atomistic simulations were initialized: were they taken from independent CG simulation snapshots? If not, the use of the adjective "independent" would be questionable given the very short atomistic simulation time length. 

      We have now added that the production simulations started from the same structure. On line 386, we now discuss the starting structure of the atomistic simulations in more detail.

      (7) (Line 202). The approach of discretizing the bilayer shape is reasonable, but no justification was provided for the 1-nm grid spacing. In my opinion, there should be a supporting figure showing how the bending energy varies with the grid spacing. 

      We now report also the effect of a 2-nm grid spacing on the results, see new text passages on page 18, and provide an explanation for the smaller 1-nm grid spacing on lines 587ff, where we write:

      “This lattice constant [a = 1 nm] is chosen to be smaller than the bin width of about 2nm used in determining the membrane shape of the simulation conformations, to take into account that the averaging of these membrane shapes can lead to a higher resolution compared to the 2 nm resolution of the individual membrane shapes.”

      (8) (Line 211). The choice by the authors to use a mixed lipid composition complicates the task of defining a reasonable bending modulus. Experimentally and in atomistic simulations, lipids with one saturated tail (like POPC or SOPC) are much stiffer when they are mixed with cholesterol (https://doi.org/10.1529/biophysj.105.067652, https://doi.org/10.1103/PhysRevE.80.021931, https://doi.org/10.1093/pnasnexus/pgad269). On the other hand, MARTINI seems to predict a slight *softening* for POPC mixed with cholesterol (https://doi.org/10.1038/s41467-023-43892-x). Further complicating this matter, mixtures of phospholipids with different preferred curvatures are predicted to be softer than pure bilayers (e.g. https://doi.org/10.1021/acs.jpcb.3c08117), but asymmetric bilayers are stiffer than symmetric ones in some circumstances (https://doi.org/10.1016/j.bpj.2019.11.3398). 

      This issue can be quite thorny: therefore, my recommendation would be to either: (a) directly compute k for their lipid composition, which is straightforward when using large CG bilayers (as was done in Fowler et al, 2016), but it would also require more advanced methods for the atomistic ones; (b) use a reasonable *experimental* value for k, based on a similar enough lipid composition. 

      We now justify in somewhat more detail why we use an asymmetric membrane, but agree that this complicates the bending energy estimates. We only aim to estimate the bending energy in the Martini 2.2 force field, because our elasticity model is based on and, thus, limited to results obtained with this force field. We have included the two further references using the Martini 2.2 force field suggested by the reviewer on line 213, and discuss now in more detail how the bending rigidity estimate enters and affects the modeling, see lines 226ff.  

      (9) (Line 224). Does this closing statement imply that all experimental work from ex-vivo samples describe Piezo states under some small but measurable tension? 

      We compare here to the cryo-EM structure in detergent micelles. So, there is no membrane tension, there may be a surface tension of the micelle, but we assume here that Piezo proteins are essentially force free in detergent micelles. Membrane embedding, in contrast, leads to strong forces on Piezo proteins already in the absence of membrane tension, because of the membrane bending energy.

      (10) (Line 304). The Discussion concludes with a reasonable point, albeit on a down note: could the authors elaborate on what kind of experimental approach may be able to verify their modeling results? 

      Very good question, but this is somewhat beyond our expertise. We don’t have a clear recommendation – it is complicated. What can be verified is the flattening, i.e. the height and curvature of the nanodome in lower-resolution experiments. We see our results in line with these experiments, see Introduction. 

      (11) (Line 331). The very title of the Majumder and Straub paper addresses the problem of excessive binding strength between protein beads in the MARTINI force field, which should be mentioned. Figure 3(d) shows that the atomistic systems have larger excess areas than the CG ones. This could be related to MARTINI's "stickiness", or just statistical sampling. Characterizing the grid spacing (see point 7 above) might help illuminate this. 

      We discuss now the larger excess area values of the atomistic simulations on lines 381ff.  

      (12) (Lines 367, 375). Are the harmonic restraints absolute position restraints or additional bonds?

      Note also that the schedule at which the restraints are released (10-ns intervals) is relatively quick. Does the membrane have enough time to equilibrate the number of lipids in each leaflet? 

      These are standard, absolute position restraints. The 10-ns intervals may be too short to fully equilibrate the numbers of lipids, we have not explored this. The main point in the setup was to have a reasonable TM helix embedding with a smooth membrane, without any rupturing. This turned out to be tricky, with the procedures illustrated in Figure 8 as solution. If the membrane is smooth, the lipid numbers quickly equilibrate either in the final relaxation or in the initial nanoseconds of the production runs.

      (13) (Line 387) The use of an isotropic barostat for equilibration further impedes the system's ability to relax its structure. I feel that the authors should validate more strongly their protocol to rule out the possibility that incomplete equilibration could bias dynamics towards flatter membranes, which is one of the main results of this paper. 

      We don’t see how choices in the initial relaxation steps could have affected our results, at least for the coarse-grained simulations. There is more and more flattening throughout all simulation trajectories, see e.g. the extrapolations in Figure 4. All initial simulation structures are significantly less flattened than the final structures in the production runs.

      (14) (Line 403). What is the protocol for reducing the membrane size for atomistic simulation? This is even more important to mention than for CG simulations. 

      We just cut lipids beyond the intended box size of the atomistic simulations. As a technical point, we now have also added on line 507 how PIP2 lipids were converted.

      (15) (Line 423). The CHARMM force field requires a cut-off distance of 12 Å for van der Waals forces, with a force-based continuous switching scheme. The authors should briefly comment on this deviation and its possible impact on membrane properties. Quick test simulations of very small atomistic bilayers with the chosen composition could be used as a comparison. 

      We don’t expect any relevant effect on membrane properties within the statistical accuracies of the quantities of interest here (i.e. excess areas).

      (16) (Equation 4). There are some mismatched parentheses: please check. 

      We have corrected this mistake.

      (17) (Equations 7-8). Why did the authors use finite-differences derivatives of z(x,y) instead of using cubic splines and the corresponding analytical derivatives? 

      In our experience, second derivatives of standard cubic splines can be problematic. The continuous membrane shapes we obtain in our analysis are averages of such splines. We find standard finite differences more reliable, and therefore discretize these shapes. Already for the 2d membrane profiles of Figure 1b and 2d, calculating curvatures from interpolations using splines is problematic.
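
      As a rough illustration of the finite-difference approach mentioned in this response, the sketch below computes the mean curvature of a height field z(x, y) on a uniform grid using central differences and the standard Monge-gauge expression. It is not the authors' Equations 7-8, which are not reproduced here; the function name `mean_curvature`, the grid spacing `h`, and the paraboloid test surface are assumptions made for the example.

```python
def mean_curvature(z, h):
    """Mean curvature of a height field z[i][j] = z(x_i, y_j) on a square grid.

    Uses second-order central differences at interior points and the
    Monge-gauge formula. Boundary points are skipped and left as None.
    """
    nx, ny = len(z), len(z[0])
    H = [[None] * ny for _ in range(nx)]
    for i in range(1, nx - 1):
        for j in range(1, ny - 1):
            # First derivatives (central differences)
            zx = (z[i + 1][j] - z[i - 1][j]) / (2 * h)
            zy = (z[i][j + 1] - z[i][j - 1]) / (2 * h)
            # Second derivatives (central differences)
            zxx = (z[i + 1][j] - 2 * z[i][j] + z[i - 1][j]) / h**2
            zyy = (z[i][j + 1] - 2 * z[i][j] + z[i][j - 1]) / h**2
            zxy = (z[i + 1][j + 1] - z[i + 1][j - 1]
                   - z[i - 1][j + 1] + z[i - 1][j - 1]) / (4 * h**2)
            num = (1 + zy**2) * zxx - 2 * zx * zy * zxy + (1 + zx**2) * zyy
            H[i][j] = num / (2 * (1 + zx**2 + zy**2) ** 1.5)
    return H


# Hypothetical test surface: paraboloid z = a * (x^2 + y^2) / 2 centred on the grid.
a, h, n = 0.3, 0.1, 5
coords = [(k - n // 2) * h for k in range(n)]
z = [[0.5 * a * (x * x + y * y) for y in coords] for x in coords]
H = mean_curvature(z, h)
apex_curvature = H[n // 2][n // 2]
```

      Central differences are exact for quadratic surfaces, so at the apex of this paraboloid the computed mean curvature should match `a` up to floating-point error. For noisy membrane profiles the grid spacing controls how much thermal roughness enters the second derivatives, which is why the choice of discretization matters.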

    1. eLife Assessment

      In this revised version, the authors provide a thorough investigation of the interaction of megakaryocytes (MK) with their associated extracellular matrix (ECM) during maturation; they provide compelling evidence that the existence of a dense cage-like pericellular structure containing laminin γ1 and α4 and collagen IV is key to fixing the perisinusoidal localization of MK and preventing their premature intravasation. Adhesion of MK to this ECM cage is dependent on integrin beta1 and beta3 expressed by MK. This strong conclusion is based on state-of-the-art techniques, including primary murine bone marrow MK cultures, mice lacking ECM receptors (namely integrin beta1 and beta3 null mice), and high-resolution 2D and 3D imaging. The study provides valuable insight into the role of cell-matrix interactions in MK maturation and provides an interesting model with practical implications for the fields of hemostasis and thrombosis.

    2. Reviewer #1 (Public review):

      The authors report on a thorough investigation of the interaction of megakaryocytes (MK) with their associated ECM during maturation. They report convincing evidence to support the existence of a dense cage-like pericellular structure containing laminin γ1 and α4 and collagen IV, which interacts with integrins β1 and β3 on MK and serves to fix the perisinusoidal localization of MK and prevent their premature intravasation. As with everything in nature, the authors support a Goldilocks range of MK-ECM interactions: inability to digest the ECM via inhibition of MMPs leads to insufficient MK maturation and development of smaller MK. This important work sheds light on the role of cell-matrix interactions in MK maturation, and suggests that higher-dimensional analyses are necessary to capture the full scope of cellular biology in the context of their microenvironment. The authors have responded appropriately to the majority of my previous comments.

      Some remaining points:

      In a previous critique, I had suggested that "it is unclear how activation of integrins allows the MK to become "architects for their ECM microenvironment" as the authors posit. A transcriptomic analysis of control and DKO MKs may help elucidate these effects". The authors pointed out the technical difficulty of obtaining sufficient numbers of MK for such analysis, which I accept, and instead analyzed mature platelets, finding no difference between control and DKO platelets. This is not necessarily surprising, since mature circulating platelets have no need to engage an ECM microenvironment, and for the same reason I would suggest that mature platelet analyses are not representative of MK behavior as regards ECM interactions.

    3. Reviewer #2 (Public review):

      Summary:

      This study makes a significant contribution to understanding the microenvironment of megakaryocytes (MKs) in the bone marrow, identifying an extracellular matrix (ECM) cage structure that influences MK localization and maturation. The authors provide compelling evidence for the presence of this ECM cage and its role in MK homeostasis, employing an array of sophisticated imaging techniques and molecular analyses.

      The authors have addressed most of the concerns raised in the previous review, providing clarifications and additional data that strengthen their conclusions.

      More broadly, this work adds to a growing recognition of the ECM as an active participant in haematopoietic cell regulation in the bone marrow microenvironment. This work could pave the way to future studies investigating how the megakaryocytes' ECM cage affects their function as part of the haematopoietic stem cell niche, and by extension, influences global haematopoiesis.

    1. eLife Assessment

      This study presents important findings demonstrating that the internalization and degradation of FZD5 and FZD8, two of the ten Frizzled proteins, are WNT dependent and do not involve DVL. The evidence supporting the claims of the authors is convincing. This research will be of interest to biologists specializing in Wnt signaling, cancer, and regenerative medicine.

    2. Reviewer #1 (Public review):

      Summary:

      The mechanism by which WNT signals are received and transduced into the cell has been the topic of extensive research. Cell surface levels of the WNT receptors of the FZD family are subject to tight control and it's well established that the transmembrane ubiquitin ligases ZNRF3 and RNF43 target FZDs for degradation and that proteins of the R-spondin family block this effect. This manuscript explores the role that WNT proteins play in receptor internalization, recycling and degradation, and the authors provide evidence that WNTs promote interactions of FZD with the ubiquitin ligases. Using cells mutant in all 3 DVL genes, the authors demonstrate that this effect of WNT on FZD is DVL-independent.

      Strengths:

      Overall, the data are of good quality and support the authors' hypothesis. A strength of this study is the use of CRISPR-mutated cell lines to establish genetic requirements for the various components. The finding that FZD internalization and degradation are WNT dependent and do not involve DVL is novel.

      Weaknesses:

      A weakness of the work includes a heavy reliance on overexpression of FZD proteins. To detect endogenous FZDs, the authors have inserted a V5 tag into the endogenous gene, which may affect their activity(ies).

    3. Reviewer #2 (Public review):

      In this manuscript, Luo et al uncover that the ZNRF3/RNF43 E3 ubiquitin ligases participate in the selective endocytosis and degradation of FZD5/8 receptors in response to Wnt stimulation. In my opinion there are three significant findings of this study: 1) Wnt proteins are required for ZNRF3/RNF43 mediated endocytosis and degradation of FZD receptors and this constitutes an important negative regulatory loop. 2) Wnt can induce FZD endocytosis in the absence of ZNRF3/RNF43 but this does not influence total or cell surface levels. 3) The ZNRF3/RNF43 substrate selectivity for FZD5/8 over the other 8 Frizzleds. Of course, many questions remain, and new ones emerge, as is often the case, but these findings challenge our dogmatic view on how ZNRF3/RNF43 regulate Wnt signaling and emphasize their role in Wnt-dependent Frizzled endocytosis/degradation and beta-catenin signaling.

      This is an elegant study employing several CRISPR-edited cell lines to tag endogenous Frizzled receptors and to knockout ZNRF3/RNF43 and all three Dishevelled proteins. One major strength of the study is therefore the careful assessment of the roles of RNF43 and ZNRF3 in endogenous expression contexts. This is especially relevant since overexpression of membrane E3 ligases has been shown to ectopically degrade membrane proteins and could have blurred previous interpretations. A second strength is clarifying the role of Dishevelled proteins in FZD endocytosis. Indeed, although previous studies suggested that the Wnt-promoted interaction between FZD and RNF43/ZNRF3 was mediated through Dvl, the authors clearly show that this is not the case (using Dvl knockout cells and functional assays). Dvl proteins, on the other hand, are still required for ligand-independent FZD-endocytosis.

      The only weakness pertains to the difference in signaling outcome, comparing elevated signaling seen when FZD levels are upregulated following ZNRF3/RNF43 KO vs ectopic overexpression. Indeed, the authors suggest that in the absence of RNF43/ZNRF3 the receptors could be recycled back to the PM and thereby contribute to increased signaling seen in the mutant cells. This has not been directly demonstrated.

    1. eLife Assessment

      This valuable study demonstrates that it is possible to decode information about characters and locations from single-unit responses in the human brain to a narrative movie, using a convincing technical approach to capture information in population-level dynamics. The study introduces an exciting dataset of single-unit responses in humans during a naturalistic and dynamic movie stimulus, with recordings from multiple regions within the medial temporal lobe. Using both a traditional firing-rate analysis as well as a population decoding analysis to connect these neural responses to the visual content of the movie, they show that in this dataset, the decoding of semantic scene features (e.g., the person currently on screen), but not scene transitions, is surprisingly driven by classically non-responsive neurons. Based on these findings, the authors argue that dynamic naturalistic semantic information may be processed within the medial temporal lobe at the population level.

    2. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Gerken et al examined how neurons in the human medial temporal lobe respond to and potentially code dynamic movie content. They had 29 patients watch a long-form movie while neurons within their MTL were monitored using depth electrodes. They found that neurons throughout the region were responsive to the content of the movie. In particular, neurons showed significant responses to people, places, and to a lesser extent, movie cuts. Modeling with a neural network suggests that neural activity within the recorded regions was better at predicting the content of the movies as a population, as opposed to individual neural representations. Surprisingly, a subpopulation of unresponsive neurons performed better than the responsive neurons at decoding the movie content, further suggesting that while classically nonresponsive, these neurons nonetheless provided critical information about the content of the visual world. The authors conclude from these results that low-level visual features, such as scene cuts, may be coded at the neuronal level, but that semantic features rely on distributed population-level codes.

      Strengths:

      Overall, the manuscript presents an interesting and reasonable argument for their findings and conclusions. Additionally, the large number of patients and neurons that were recorded and analyzed makes this data set unique and potentially very powerful. On the whole, the manuscript was very well written, and as it is, presents an interesting and useful set of data about the intricacies of how dynamic naturalistic semantic information may be processed within the medial temporal lobe.

      Weaknesses:

      There are a number of concerns I have based on some of the experimental and statistical methods employed that I feel would help to improve our understanding of the current data.

      In particular, the authors do not address the issue of superposed visual features very well throughout the manuscript. Previous research using naturalistic movies has shown that low-level visual features, particularly motion, are capable of driving much of the visual system (e.g., Bartels et al 2005; Bartels et al 2007; Huth et al 2012; Çukur et al 2013; Russ et al 2015; Nentwich et al 2023). In some of these papers, low-level features were regressed out to look at the influence of semantics; in others, the influence of low-level features was explicitly modeled. The current manuscript, for the most part, appears to ignore these features with the exception of scene cuts. Based on the previous evidence that low-level features continue to drive later cortical regions, it seems like including these as regressors of no interest or, more ideally, as additional variables, would help to determine how well MTL codes for semantic features over top of these lower-order variables.

      Following on this, much of the current analyses rely on the training of deep neural networks to decode particular features. The results of these analyses are illuminating, however, throughout the manuscript, I was increasingly wondering how the various variables interact with each other. For example, separate analyses were done for the patients, regions, and visual features. However, the logistic regression analysis that was employed could have all of these variables input together, obtaining beta weights for each one in an overall model. This would potentially provide information about how much each variable contributes to the overall decoding in relation to the others.

      A few more minor points that would help to clarify the current results involve the selection of data for particular analyses. For some analyses, the authors chose to appropriately downsample their data sets to compare across variables. However, there are a few places where similar downsampling would be informative, but was not completed. In particular, the analyses for patients and regions may have a more informative comparison if the full population were downsampled to match the size of the population for each patient or region of interest. This could be done with the Monte Carlo sampling that is used in other analyses, thus providing a control for population size while still sampling the full population.
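
      The size-matched control suggested here could be sketched roughly as follows: draw many random subsets of the full population, each the size of one patient's or region's population, and score every subset. This is only a toy under stated assumptions; `downsampled_scores`, `toy_score`, and the choice of 10 informative units out of 100 are hypothetical stand-ins for the decoding pipeline, which is not reproduced here.

```python
import random
import statistics


def downsampled_scores(population, k, score_fn, n_iter=200, seed=0):
    """Score n_iter random size-k subsets of `population`, drawn without replacement."""
    if k > len(population):
        raise ValueError("k cannot exceed the population size")
    rng = random.Random(seed)
    return [score_fn(rng.sample(population, k)) for _ in range(n_iter)]


# Toy example: 'neurons' are just ids, and the hypothetical score is the
# fraction of sampled neurons that carry signal.
full_population = list(range(100))
informative = set(range(10))  # assumed: 10 of 100 units carry signal


def toy_score(subset):
    return sum(1 for n in subset if n in informative) / len(subset)


scores = downsampled_scores(full_population, k=20, score_fn=toy_score)
mean_score = statistics.mean(scores)
spread = statistics.stdev(scores)
```

      In a real analysis `score_fn` would train and evaluate the decoder on the sampled units, and the resulting distribution would be compared with the score of the patient- or region-specific population of the same size.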

    3. Reviewer #2 (Public review):

      Summary:

      This study introduces an exciting dataset of single-unit responses in humans during a naturalistic and dynamic movie stimulus, with recordings from multiple regions within the medial temporal lobe. The authors use both a traditional firing-rate analysis as well as a sophisticated decoding analysis to connect these neural responses to the visual content of the movie, such as which character is currently on screen.

      Strengths:

      The results reveal some surprising similarities and differences between these two kinds of analyses. For visual transitions (such as camera angle cuts), the neurons identified in the traditional response analysis (looking for changes in firing rate of an individual neuron at a transition) were the most useful for doing population-level decoding of these cuts. Interestingly, this wasn't true for character decoding; excluding these "responsive" neurons largely did not impact population-level decoding, suggesting that the population representation is distributed and not well-captured by individual-neuron analyses.

      The methods and results are well-described both in the text and in the figures. This work could be an excellent starting point for further research on this topic to understand the complex representational dynamics of single neurons during naturalistic perception.

      Weaknesses:

      (1) I am unsure what the central scientific questions of this work are, and how the findings should impact our understanding of neural representations. Among the questions listed in the introduction is "Which brain regions are informative for specific stimulus categories?". This is a broad research area that has been addressed in many neuroimaging studies for decades, and it's not clear that the results tell us new information about region selectivity. "Is the relevant information distributed across the neuronal population?" is also a question with a long history of work in neuroscience about localist vs distributed representations, so I did not understand what specific claim was being made and tested here. Responses in individual neurons were found for all features across many regions (e.g., Table S1), but decodable information was also spread across the population.

      (2) The character and indoor/outdoor labels seem fundamentally different from the scene/camera cut labels, and I was confused by the way that the cuts were put into the decoding framework. The decoding analyses took a 1600ms window around a frame of the video (despite labeling these as frame "onsets" like the feature onsets in the responsive-neuron analysis, I believe this is for any frame regardless of whether it is the onset of a feature), with the goal of predicting a binary label for that frame. Although this makes sense for the character and indoor/outdoor labels, which are a property of a specific frame, it is confusing for the cut labels since these are inherently about a change across frames. The way the authors handle this is by labeling frames as cuts if they are in the 520ms following a cut (there is no justification given for this specific value). Since the input to a decoder is 1600ms, this seems like a challenging decoding setup; the model must respond that an input is a "cut" if there is a cut-specific pattern present approximately in the middle of the window, but not if the pattern appears near the sides of the window. A more straightforward approach would be, for example, to try to discriminate between windows just after a cut versus windows during other parts of the video. It is also unclear how neurons "responsive" to cuts were defined, since the authors state that this was determined by looking for times when a feature was absent for 1000ms to continuously present for 1000ms, which would never happen for cuts (unless this definition was different for cuts?).
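
      To make the labeling concern concrete, the sketch below marks frames as "cut" when they fall within 520 ms after a cut and then measures how many frames inside a 1600 ms window share that label. The 25 fps frame rate, the single cut at 2000 ms, the inclusion of the cut frame itself, and the function names are assumptions made for illustration, not details taken from the manuscript.

```python
def label_cut_frames(frame_times_ms, cut_times_ms, post_cut_ms=520):
    """Label a frame 1 if it falls within post_cut_ms after any cut (cut frame included)."""
    labels = []
    for t in frame_times_ms:
        is_cut = any(0 <= t - c < post_cut_ms for c in cut_times_ms)
        labels.append(1 if is_cut else 0)
    return labels


def window_positive_fraction(frame_times_ms, labels, center_ms, half_width_ms=800):
    """Fraction of frames labelled 1 inside a window centred on center_ms."""
    in_window = [lab for t, lab in zip(frame_times_ms, labels)
                 if abs(t - center_ms) <= half_width_ms]
    return sum(in_window) / len(in_window)


frame_times = list(range(0, 4000, 40))  # assumed 25 fps, i.e. 40 ms per frame
labels = label_cut_frames(frame_times, cut_times_ms=[2000])
frac = window_positive_fraction(frame_times, labels, center_ms=2000)
```

      Under these assumptions, a window centred exactly on the cut contains 41 frames of which only 13 carry the "cut" label, so most of the decoder input belongs to the opposite class even in the most favourable case.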

      (3) The architecture of the decoding model is interesting but needs more explanation. The data is preprocessed with "a linear layer of same size as the input" (is this a layer added to the LSTM that is also trained for classification, or a separate step?), and the number of linear layers after the LSTM is "adapted" for each label type (how many were used for each label?). The LSTM also gets to see data from 800 ms before and after the labeled frame, but usually LSTMs have internal parameters that are the same for all timesteps; can the model know when the "critical" central frame is being input versus the context, i.e., are the inputs temporally tagged in some way? This may not be a big issue for the character or location labels, which appear to be contiguous over long durations and therefore the same label would usually be present for all 1600ms, but this seems like a major issue for the cut labels since the window will include a mix of frames with opposite labels.

      (4) Because this is a naturalistic stimulus, some labels are very imbalanced ("Persons" appears in almost every frame), and the labels are correlated. The authors attempt to address the imbalance issue by oversampling the minority class during training, though it's not clear this is the right approach since the test data does not appear to be oversampled; for example, training the Persons decoder to label 50% of training frames as having people seems like it could lead to poor performance on a test set with nearly 100% Persons frames, versus a model trained to be biased toward the most common class. There is no attempt to deal with correlated features, which is especially problematic for features like "Summer Faces" and "Summer Presence", which I would expect to be highly overlapping, making it more difficult to interpret decoding performance for specific features.
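
      The mismatch described in this point can be shown with a small sketch of random oversampling applied to the training split only. The 95% positive rate is an assumed stand-in for a label such as "Persons", and `oversample_minority` is a hypothetical helper, not the authors' implementation.

```python
import random
from collections import Counter


def oversample_minority(samples, labels, seed=0):
    """Randomly duplicate minority-class samples until both classes are equally frequent."""
    rng = random.Random(seed)
    counts = Counter(labels)
    if len(counts) != 2:
        raise ValueError("expected exactly two classes")
    (major, n_major), (minor, n_minor) = counts.most_common(2)
    minority_idx = [i for i, y in enumerate(labels) if y == minor]
    # Draw extra minority indices with replacement to close the gap.
    extra = [rng.choice(minority_idx) for _ in range(n_major - n_minor)]
    keep = list(range(len(labels))) + extra
    return [samples[i] for i in keep], [labels[i] for i in keep]


# Toy data: 95% positive frames (assumed rate for illustration).
train_x = list(range(100))
train_y = [1] * 95 + [0] * 5
bal_x, bal_y = oversample_minority(train_x, train_y)
train_prior = sum(bal_y) / len(bal_y)      # positive rate seen during training
natural_prior = sum(train_y) / len(train_y)  # positive rate if the test set is left untouched
```

      After balancing, the model sees a 50% positive rate during training while an untouched test set keeps the natural rate, which is the prior shift the reviewer is asking about. Reporting a prior-insensitive metric or recalibrating the decision threshold are common ways to handle this.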

      (5) Are "responsive" neurons defined as only those showing firing increases at a feature onset, or would decreased activity also count as responsive? If only positive changes are labeled responsive, this would help explain how non-responsive neurons could be useful in a decoding analysis.

      (6) Line 516 states that the scene cuts here are analogous to the hard boundaries in Zheng et al. (2022), but the hard boundaries are transitions between completely unrelated movies rather than scenes within the same movie. Previous work has found that within-movie and across-movie transitions may rely on different mechanisms, e.g., see Lee & Chen, 2022 (10.7554/eLife.73693).

    4. Reviewer #3 (Public review):

      This is an excellent, very interesting paper. There is a groundbreaking analysis of the data, going from typical picture presentation paradigms to more realistic conditions. I would like to ask the authors to consider a few points in the comments below.

      (1) From Figure 2, I understand that there are 7 neurons responding to the character Summer, but then in line 157, we learn that there are 46. Are the other 39 from other areas (not parahippocampal)? If this is the case, it would be important to see examples of these responses, as one of the main claims is that it is possible to decode as well or better with non-responsive compared to single responsive neurons, which is, in principle, surprising.

      (2) Also in Figure 2, there seem to be relatively very few neurons responding to Summer (1.88%) and to outdoor scenes (1.07%). Is this significant? Isn't it also a bit surprising, particularly for outdoor scenes, considering a previous paper of Mormann showing many outdoor scene responses in this area? It would be nice if the authors could comment on this.

      (3) I was also surprised to see that there are many fewer responses to scene cuts (6.7%) compared to camera cuts (51%) because every scene cut involves a camera cut. Could this have been a result of the much larger number of camera cuts? (A way to test this would be to subsample the camera cuts.)

      (4) Line 201. The analysis of decoding on a per-patient basis is important, but it should be done on a per-session basis - i.e., considering only simultaneously recorded neurons, without any pooling. This is because pooling can overestimate decoding performances (see e.g. Quian Quiroga and Panzeri NRN 2009). If there was only one session per patient, then this should be called 'per-session' rather than 'per-patient' to make it clear that there was no pooling.

      (5) In general, the decoding results are quite interesting, and I was wondering if the authors could give a bit more insight by showing confusion matrices, with the predictions of the appearance of each of the characters, etc. Some of the characters may appear together, so this could be another entry of the decoder (say, predicting person A, B, C, A&B, A&C, B&C, A&B&C). I guess this could also show the power of analyzing the population activity.
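
      One way to build the kind of confusion matrix suggested here is to treat every combination of on-screen characters as its own class. The sketch below does this for three placeholder characters A, B, and C; the helper names and the toy frame labels are hypothetical and only illustrate the bookkeeping.

```python
from collections import Counter
from itertools import combinations


def combo_classes(characters):
    """All character combinations, including the empty one, as sorted-tuple class labels."""
    classes = [()]
    for r in range(1, len(characters) + 1):
        classes.extend(combinations(sorted(characters), r))
    return classes


def confusion_matrix(true_sets, pred_sets, classes):
    """Rows are true classes, columns are predicted classes; entries are frame counts."""
    index = {c: i for i, c in enumerate(classes)}
    counts = Counter((tuple(sorted(t)), tuple(sorted(p)))
                     for t, p in zip(true_sets, pred_sets))
    matrix = [[0] * len(classes) for _ in classes]
    for (t, p), n in counts.items():
        matrix[index[t]][index[p]] = n
    return matrix


classes = combo_classes(["A", "B", "C"])
true_sets = [{"A"}, {"A", "B"}, {"A", "B"}, set(), {"C"}]
pred_sets = [{"A"}, {"A"}, {"A", "B"}, set(), {"B", "C"}]
cm = confusion_matrix(true_sets, pred_sets, classes)
```

      Off-diagonal entries then show which combinations are confused with one another, for example a frame with A and B together being decoded as A alone.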

      (6) Lines 406-407. The claim that stimulus-selective responses to characters did not account for the decoding of the same character is very surprising. If I understood it correctly, the response criterion the authors used gives 'responsiveness' but not 'selectivity'. So, were people's responses selective (e.g., firing only to Summer) or non-selective (firing to a few characters)? This could explain why they didn't get good decoding results with responsive neurons. Again, it would be nice to see confusion matrices with the decoding of the characters. Another reason for this is that what are labelled as responsive neurons have relatively weak and variable responses.

      (7) Line 455. The claim that 500 neurons drive decoding performance is very subjective. 500 neurons gives a performance of 0.38, and 50 neurons gives 0.33.

      (8) Lines 492-494. I disagree with the claim that "character decoding does not rely on individual cells, as removing neurons that responded strongly to character onset had little impact on performance". I have not seen strong responses to characters in the paper. In particular, the response to Summer in Figure 2 looks very variable and relatively weak. If there are stronger responses to characters, please show them to make a convincing argument. It is fine to argue that you can get information from the population, but in my view, there are no good single-cell responses (perhaps because the actors and the movie were unknown to the subjects) to make this claim. Also, an older paper (Quian Quiroga et al J. Neurophysiol. 2007) showed that the decoding of individual stimuli in a picture presentation paradigm was determined by the responsive neurons and that the non-responsive neurons did not add any information. The results here could be different due to the use of movies instead of picture presentations, but most likely due to the fact that, in the picture presentation paradigm, the pictures were of famous people for which there were strong single neuron responses, unlike with the relatively unknown persons in this paper.

    5. Author response:

      Reviewer #1 (Public review):

      Summary:

      In this manuscript, Gerken et al examined how neurons in the human medial temporal lobe respond to and potentially code dynamic movie content. They had 29 patients watch a long-form movie while neurons within their MTL were monitored using depth electrodes. They found that neurons throughout the region were responsive to the content of the movie. In particular, neurons showed significant responses to people, places, and to a lesser extent, movie cuts. Modeling with a neural network suggests that neural activity within the recorded regions was better at predicting the content of the movies as a population, as opposed to individual neural representations. Surprisingly, a subpopulation of unresponsive neurons performed better than the responsive neurons at decoding the movie content, further suggesting that while classically nonresponsive, these neurons nonetheless provided critical information about the content of the visual world. The authors conclude from these results that low-level visual features, such as scene cuts, may be coded at the neuronal level, but that semantic features rely on distributed population-level codes.

      Strengths:

      Overall, the manuscript presents an interesting and reasonable argument for their findings and conclusions. Additionally, the large number of patients and neurons that were recorded and analyzed makes this data set unique and potentially very powerful. On the whole, the manuscript was very well written, and as it is, presents an interesting and useful set of data about the intricacies of how dynamic naturalistic semantic information may be processed within the medial temporal lobe.

      We thank the reviewer for their comments on our manuscript and for describing the strengths of our presented work.

      Weaknesses:

      There are a number of concerns I have based on some of the experimental and statistical methods employed that I feel would help to improve our understanding of the current data.

      In particular, the authors do not address the issue of superposed visual features very well throughout the manuscript. Previous research using naturalistic movies has shown that low-level visual features, particularly motion, are capable of driving much of the visual system (e.g., Bartels et al 2005; Bartels et al 2007; Huth et al 2012; Çukur et al 2013; Russ et al 2015; Nentwich et al 2023). In some of these papers, low-level features were regressed out to look at the influence of semantics; in others, the influence of low-level features was explicitly modeled. The current manuscript, for the most part, appears to ignore these features with the exception of scene cuts. Based on the previous evidence that low-level features continue to drive later cortical regions, it seems like including these as regressors of no interest or, more ideally, as additional variables, would help to determine how well MTL codes for semantic features over top of these lower-order variables.

      We thank the reviewer for this insightful comment and for the relevant literature regarding visual motion in not only the primary visual system but in cortical areas as well. While we agree that the inclusion of visual motion as a regressor of no interest or as an additional variable would be overall informative in determining if single neurons in the MTL are driven by this level of feature, we would argue that our analyses already provide some insight into its role and that only the parahippocampal cortical neurons would robustly track this feature.

      As noted by the reviewer, our model includes two features derived from visual motion: Camera Cuts (directly derived from frame-wise changes in pixel values) and Scene Cuts (a subset of Camera Cuts restricted to changes in scene). As shown in Fig. 5a, decoding performance for these features was strongest in the parahippocampal cortex (~20%), compared to other MTL areas (~10%). While the entorhinal cortex also showed some performance for Scene Cuts (15%), we interpret this as being driven by the changes in location that define a scene, rather than by motion itself.

      These findings suggest that while motion features are tracked in the MTL, the effect may be most robust in the parahippocampal cortex. We believe that quantifying more complex 3D motion in a naturalistic stimulus like a full-length movie is a significant challenge that would likely require a dedicated study. We agree this is an interesting future research direction and will update the manuscript to highlight this for the reader.

      A few more minor points that would help to clarify the current results involve the selection of data for particular analyses. For some analyses, the authors chose to appropriately downsample their data sets to compare across variables. However, there are a few places where similar downsampling would be informative, but was not completed. In particular, the analyses for patients and regions may have a more informative comparison if the full population were downsampled to match the size of the population for each patient or region of interest. This could be done with the Monte Carlo sampling that is used in other analyses, thus providing a control for population size while still sampling the full population.

      We thank the reviewer for raising this important methodological point. The decision not to downsample the patient- and region-specific analyses was deliberate, and we appreciate the opportunity to clarify our rationale.

      Generally, we would like to emphasize that due to technical and ethical limitations of human single-neuron recordings, it is currently not possible to record large populations of neurons simultaneously in individual patients. The limited and variable number of recorded neurons per subject (Fig. S1) generally requires pooling neurons into pseudo-populations for decoding, which is a well-established standard in human single-neuron studies (see, e.g., Jamali et al., 2021; Kamiński et al., 2017; Minxha et al., 2020; Rutishauser et al., 2015; Zheng et al., 2022).

      For the patient-specific analysis, our primary goal was to show that no single patient's data could match the performance of the complete pseudo-population. Crucially, we found no direct relationship between the number of recorded neurons and decoding performance; patients with the most neurons (patients 4, 13) were not top performers, and those with the fewest (patients 11, 14) were not the worst (see Fig. 4). This indicates that neuron count was not the primary limiting factor and that downsampling would be unlikely to provide additional insight.

      Similarly, for the region-specific analysis, regions with larger neural populations did not systematically outperform those with fewer neurons (Fig. 5). Given the inherent sparseness of single-neuron data, we concluded that retaining the full dataset was more informative than excluding neurons simply to equalize population sizes.

      We agree that this methodological choice should be transparent and explicitly justified in the text. We will add an explanation to the revised manuscript to justify why this approach was taken and how it differs from the analysis in Fig. 6.
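
      The size-matched control discussed in this exchange can be illustrated with a minimal sketch: random subsets of the pooled pseudo-population are drawn repeatedly at a target size, and a score is summarized across draws. The function name, its parameters, and `score_fn` (a stand-in for decoder performance) are hypothetical and are not taken from the manuscript.

```python
import random
import statistics

def monte_carlo_downsample(neuron_ids, target_size, score_fn, n_draws=100, seed=0):
    """Score random size-matched subsets of a pooled population.

    neuron_ids  : identifiers of all neurons in the pseudo-population
    target_size : number of neurons per draw (e.g. one patient's neuron count)
    score_fn    : hypothetical callable mapping a list of ids to a score
    """
    if target_size > len(neuron_ids):
        raise ValueError("target_size exceeds population size")
    rng = random.Random(seed)
    scores = []
    for _ in range(n_draws):
        subset = rng.sample(neuron_ids, target_size)  # draw without replacement
        scores.append(score_fn(subset))
    return statistics.mean(scores), statistics.stdev(scores)

# Toy usage: the "score" is just the fraction of even ids in the subset.
population = list(range(200))
toy_score = lambda ids: sum(i % 2 == 0 for i in ids) / len(ids)
mean_score, sd_score = monte_carlo_downsample(population, 20, toy_score)
```

      In a real analysis, `score_fn` would train and evaluate a decoder on the sampled neurons, so the spread across draws shows how much of a patient- or region-specific result could be explained by population size alone.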

      Reviewer #2 (Public review):

      Summary:

      This study introduces an exciting dataset of single-unit responses in humans during a naturalistic and dynamic movie stimulus, with recordings from multiple regions within the medial temporal lobe. The authors use both a traditional firing-rate analysis as well as a sophisticated decoding analysis to connect these neural responses to the visual content of the movie, such as which character is currently on screen.

      Strengths:

      The results reveal some surprising similarities and differences between these two kinds of analyses. For visual transitions (such as camera angle cuts), the neurons identified in the traditional response analysis (looking for changes in firing rate of an individual neuron at a transition) were the most useful for doing population-level decoding of these cuts. Interestingly, this wasn't true for character decoding; excluding these "responsive" neurons largely did not impact population-level decoding, suggesting that the population representation is distributed and not well-captured by individual-neuron analyses.

      The methods and results are well-described both in the text and in the figures. This work could be an excellent starting point for further research on this topic to understand the complex representational dynamics of single neurons during naturalistic perception.

      We thank the reviewer for their feedback and for summarizing the results of our work.

      (1) I am unsure what the central scientific questions of this work are, and how the findings should impact our understanding of neural representations. Among the questions listed in the introduction is "Which brain regions are informative for specific stimulus categories?". This is a broad research area that has been addressed in many neuroimaging studies for decades, and it's not clear that the results tell us new information about region selectivity. "Is the relevant information distributed across the neuronal population?" is also a question with a long history of work in neuroscience about localist vs distributed representations, so I did not understand what specific claim was being made and tested here. Responses in individual neurons were found for all features across many regions (e.g., Table S1), but decodable information was also spread across the population.

      We thank the reviewer for this important point, which gets to the core of our study's contribution. While concepts like regional specificity are well-established from studies at the blood-flow level, their investigation at the single-neuron level in humans during naturalistic, dynamic stimulation remains a critical open question. The type of coding (sparse vs. distributed), on the other hand, cannot be investigated with blood-flow studies, as the technology lacks the necessary spatial and temporal resolution.

      Our study addresses this gap directly. The exceptional temporal resolution of single-neuron recordings allows us to move beyond traditional paradigms and examine cellular-level dynamics as neuronal responses unfold frame by frame during a more naturalistic and ecologically valid stimulus. It cannot be assumed that findings from other modalities or simplified stimuli will generalize to this context.

      To meet this challenge, we employed a dual analytical strategy: combining a classic single-unit approach with a machine learning-based population analysis. This allowed us to create a bridge between prior work and our more naturalistic data. A key result is that our findings are often consistent with the existing literature, which validates the generalizability of those principles. However, the differences we observe between these two analytical approaches are equally informative, providing new insights into how the brain processes continuous, real-world information.

      We will revise the introduction and discussion to more explicitly frame our work in this context, emphasizing the specific scientific question driving this study, while also highlighting the strengths of our experimental design and recording methods.

      (2) The character and indoor/outdoor labels seem fundamentally different from the scene/camera cut labels, and I was confused by the way that the cuts were put into the decoding framework. The decoding analyses took a 1600ms window around a frame of the video (despite labeling these as frame "onsets" like the feature onsets in the responsive-neuron analysis, I believe this is for any frame regardless of whether it is the onset of a feature), with the goal of predicting a binary label for that frame. Although this makes sense for the character and indoor/outdoor labels, which are a property of a specific frame, it is confusing for the cut labels since these are inherently about a change across frames. The way the authors handle this is by labeling frames as cuts if they are in the 520ms following a cut (there is no justification given for this specific value). Since the input to a decoder is 1600ms, this seems like a challenging decoding setup; the model must respond that an input is a "cut" if there is a cut-specific pattern present approximately in the middle of the window, but not if the pattern appears near the sides of the window. A more straightforward approach would be, for example, to try to discriminate between windows just after a cut versus windows during other parts of the video. It is also unclear how neurons "responsive" to cuts were defined, since the authors state that this was determined by looking for times when a feature was absent for 1000ms to continuously present for 1000ms, which would never happen for cuts (unless this definition was different for cuts?).

      We thank the reviewer for the valuable comment regarding the cut labels specifically. The choice to label frames that lie within a 520ms window following a cut as positive was based on prior research and is intended to include the response onsets across all regions within the MTL (Mormann et al., 2008). We agree that this explanation is currently missing from the manuscript, and we will add a brief clarification in the revised version.

      As correctly noted, the decoding analysis does not rely on feature onset but instead continuously decodes features throughout the entire movie. Thus, all frames are included, regardless of whether they correspond to a feature onset.

      Our treatment of cut labels as sustained events is a deliberate methodological choice. Neural responses to events like cuts often unfold over time, and by extending the label, we provide our LSTM network with the necessary temporal window to learn this evolving signature. This approach not only leverages the sequential processing strengths of the LSTM (Hochreiter et al., 1997) but also ensures a consistent analytical framework for both event-based (cuts) and state-based (character or location) features.
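
      The labeling rule described above can be sketched as follows. This is an illustration only: the function name, the frame spacing, and the choice to treat the window as half-open (the cut frame included, the frame exactly 520 ms later excluded) are assumptions, not details confirmed by the manuscript.

```python
def label_cut_frames(frame_times_ms, cut_times_ms, window_ms=520):
    """Return 1 for frames that fall within `window_ms` after any cut, else 0."""
    labels = []
    for t in frame_times_ms:
        # Half-open window [cut, cut + window_ms): an assumed boundary convention.
        positive = any(0 <= t - c < window_ms for c in cut_times_ms)
        labels.append(1 if positive else 0)
    return labels

# Toy usage: frames every 40 ms (a hypothetical 25 fps movie), one cut at 200 ms.
frames = [i * 40 for i in range(25)]  # 0, 40, ..., 960 ms
labels = label_cut_frames(frames, cut_times_ms=[200])
```

      Every frame still receives a label, which matches the continuous decoding setup: positive frames form a short sustained block after each cut rather than a single time point.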

      (3) The architecture of the decoding model is interesting but needs more explanation. The data is preprocessed with "a linear layer of same size as the input" (is this a layer added to the LSTM that is also trained for classification, or a separate step?), and the number of linear layers after the LSTM is "adapted" for each label type (how many were used for each label?). The LSTM also gets to see data from 800 ms before and after the labeled frame, but usually LSTMs have internal parameters that are the same for all timesteps; can the model know when the "critical" central frame is being input versus the context, i.e., are the inputs temporally tagged in some way? This may not be a big issue for the character or location labels, which appear to be contiguous over long durations and therefore the same label would usually be present for all 1600ms, but this seems like a major issue for the cut labels since the window will include a mix of frames with opposite labels.

      We thank the reviewer for their insightful comments regarding the decoding architecture. The model consists of an LSTM followed by 1–3 linear readout layers, where the exact number of layers is treated as a hyperparameter and selected based on validation performance for each label type. The initial linear layer applied to the input is part of the trainable model and serves as a projection layer to transform the binned neural activity into a suitable feature space before feeding it into the LSTM. The model is trained in an end-to-end fashion on the classification task.

      Regarding temporal context, the model receives a 1600 ms window (800 ms before and after the labeled frame), and as correctly pointed out by the reviewer, LSTM parameters are shared across time steps. We do not explicitly tag the temporal position of the central frame within the sequence. While this may have limited impact for labels that persist over time (e.g., characters or locations), we agree this could pose a challenge for cut labels, which are more temporally localized.

      We thank the reviewer for this important point. We will clarify this limitation in the revised manuscript and note that positional encoding is a valuable direction for future work to better guide the model’s focus within the temporal window. Hyperparameters were optimized individually for each feature and split; to improve methodological transparency, we will also add a supplementary table detailing the hyperparameter ranges used in our decoding networks.
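
      One simple way to temporally tag the inputs, as raised in this exchange, is to append to each time bin its signed offset from the central (labeled) frame. The sketch below is a hypothetical illustration of that idea, not the positional encoding the authors plan to use; the function name and the toy data are assumptions.

```python
def tag_relative_position(window):
    """Append a signed, normalized offset from the central bin to each time bin.

    window : list of time bins, each a list of per-neuron spike counts
    """
    n_bins = len(window)
    center = (n_bins - 1) / 2   # position of the labeled (central) frame
    half = max(center, 1)       # avoid division by zero for one-bin windows
    return [list(bin_counts) + [(i - center) / half]
            for i, bin_counts in enumerate(window)]

# Toy usage: 5 time bins x 3 hypothetical neurons.
window = [[0, 1, 0], [2, 0, 1], [1, 1, 1], [0, 0, 3], [1, 0, 0]]
tagged = tag_relative_position(window)
```

      With the extra feature, a recurrent model with shared weights can in principle distinguish the central frame (offset 0) from context frames at the edges of the window (offsets of -1 and 1).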

      (4) Because this is a naturalistic stimulus, some labels are very imbalanced ("Persons" appears in almost every frame), and the labels are correlated. The authors attempt to address the imbalance issue by oversampling the minority class during training, though it's not clear this is the right approach since the test data does not appear to be oversampled; for example, training the Persons decoder to label 50% of training frames as having people seems like it could lead to poor performance on a test set with nearly 100% Persons frames, versus a model trained to be biased toward the most common class. [...]

      We thank the reviewer for this critical and thoughtful comment. We agree that the imbalanced and correlated nature of labels in naturalistic stimuli is a key challenge.

      To address this, we follow a standard machine learning practice: oversampling is applied exclusively to the training data. This technique helps the model learn from underrepresented classes by creating more balanced training batches, thus preventing it from simply defaulting to the majority class. Crucially, the test set remains unaltered to ensure our evaluation reflects the model's true generalization performance on the natural data distribution.

      For the “Persons” feature, which appears in nearly all frames, defining a meaningful negative class is particularly challenging. The decoder must learn to identify subtle variations within a highly skewed distribution. Oversampling during training helps provide a more balanced learning signal, while keeping the test distribution intact ensures proper evaluation of generalization.

      The reviewer’s comment—that we are “training the Persons decoder to label 50% of training frames as having people”—may suggest that labels were modified. We want to emphasize this is not the case. Our oversampling strategy does not alter the labels; it simply increases the exposure of the rare, underrepresented class during training to ensure the model can learn its pattern despite its low frequency.

      We will revise the Methods section to describe this standard procedure more explicitly, clarifying that oversampling is a training-only strategy to mitigate class imbalance.
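
      The training-only oversampling described above can be sketched as below. This is a minimal illustration of random oversampling for a binary label, assuming classes coded as 0 and 1; the function name and the toy frames are hypothetical, and the authors' actual procedure may differ in detail.

```python
import random

def oversample_minority(samples, labels, seed=0):
    """Duplicate minority-class training samples until both classes are equal in size.

    Labels are never changed; only how often a sample is seen during training.
    """
    rng = random.Random(seed)
    by_class = {0: [], 1: []}
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    minority = min(by_class, key=lambda c: len(by_class[c]))
    majority = 1 - minority
    if not by_class[minority]:
        raise ValueError("minority class has no samples to duplicate")
    deficit = len(by_class[majority]) - len(by_class[minority])
    extra = [rng.choice(by_class[minority]) for _ in range(deficit)]
    out = [(x, majority) for x in by_class[majority]]
    out += [(x, minority) for x in by_class[minority] + extra]
    rng.shuffle(out)
    return out

# Toy usage: 9 positive frames and 1 negative frame in the training split.
train_x = [f"frame{i}" for i in range(10)]
train_y = [1] * 9 + [0]
balanced = oversample_minority(train_x, train_y)
```

      The function is applied to the training split only; the test split is passed to the evaluation unchanged, so reported performance reflects the natural class distribution.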

      (5) Are "responsive" neurons defined as only those showing firing increases at a feature onset, or would decreased activity also count as responsive? If only positive changes are labeled responsive, this would help explain how non-responsive neurons could be useful in a decoding analysis.

      We define responsive neurons as those showing increased firing rates at feature onset; we did not test for decreases in activity. We thank the reviewer for this valuable comment and will address this point in the revised manuscript by assessing responsiveness without a restriction on the direction of the firing-rate change.

      (6) Line 516 states that the scene cuts here are analogous to the hard boundaries in Zheng et al. (2022), but the hard boundaries are transitions between completely unrelated movies rather than scenes within the same movie. Previous work has found that within-movie and across-movie transitions may rely on different mechanisms, e.g., see Lee & Chen, 2022 (10.7554/eLife.73693).

      We thank the reviewer for pointing out this distinction and for including the relevant work from Lee & Chen (2022), which further contextualizes it. Indeed, the hard boundaries defined in the cited paper differ slightly from ours. The study distinguishes between (1) hard boundaries, which are transitions between unrelated movies, and (2) soft boundaries, which are transitions between related events within the same movie. While our camera cuts resemble their soft boundaries, our scene cuts do not fully align with either category. We defined scene cuts to be more similar to the study’s hard boundaries, but we recognize this correspondence is not exact. We will clarify the distinctions between our scene cuts and the hard boundaries described in Zheng et al. (2022) in the revised manuscript, and will update our text to include the finding from Lee & Chen (2022).

      Reviewer #3 (Public review):

      This is an excellent, very interesting paper. There is a groundbreaking analysis of the data, going from typical picture presentation paradigms to more realistic conditions. I would like to ask the authors to consider a few points in the comments below.

      (1) From Figure 2, I understand that there are 7 neurons responding to the character Summer, but then in line 157, we learn that there are 46. Are the other 39 from other areas (not parahippocampal)? If this is the case, it would be important to see examples of these responses, as one of the main claims is that it is possible to decode as good or better with non-responsive compared to single responsive neurons, which is, in principle, surprising.

      We thank the reviewer for pointing out this ambiguity in the text. Yes, the other 39 units are responsive neurons from other areas. We will clarify to which neuronal sets the number of responsive neurons corresponds. We will also include response plots depicting the unit activity for the mentioned units.

      (2) Also in Figure 2, there seem to be relatively very few neurons responding to Summer (1.88%) and to outdoor scenes (1.07%). Is this significant? Isn't it also a bit surprising, particularly for outdoor scenes, considering a previous paper of Mormann showing many outdoor scene responses in this area? It would be nice if the authors could comment on this.

      We thank the reviewer for this insightful point. While a low response to the general 'outdoor scene' label seems surprising at first, our findings align with the established role of the parahippocampal cortex (PHC) in processing scenes and spatial layouts. In previous work using static images, each image introduces a new spatial context. In our movie stimulus, new spatial contexts specifically emerge at scene cuts. Accordingly, our data show a strong PHC response precisely at these moments. We will revise the discussion to emphasize this interpretation, highlighting the consistency with prior work.

      Regarding the first comment, we did not originally test whether the proportion of responsive units is significant (e.g., with a binomial test). We will include the results of a binomial test for each region and feature pair in the revised manuscript.
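
      A binomial test of this kind can be sketched as follows. The counts and the 5% chance level are hypothetical placeholders chosen for illustration; the appropriate chance level depends on the false-positive rate of the response criterion actually used.

```python
from math import comb

def binomial_sf(k, n, p):
    """One-sided exact binomial p-value: P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical numbers: 12 responsive units out of 100 recorded, with an
# assumed 5% false-positive rate for the per-unit response criterion.
p_value = binomial_sf(12, 100, 0.05)
```

      A small `p_value` would indicate that more units pass the response criterion than expected by chance alone; with one test per region and feature pair, a correction for multiple comparisons would also be needed.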

      (3) I was also surprised to see that there are many fewer responses to scene cuts (6.7%) compared to camera cuts (51%) because every scene cut involves a camera cut. Could this have been a result of the much larger number of camera cuts? (A way to test this would be to subsample the camera cuts.)

      The decrease in responsive units for scene cuts relative to camera cuts could indeed be due to the overall decrease in “trials” from one label to the other. To test this, we will follow the reviewer’s suggestion and perform tests using sets of randomly subsampled camera cuts and will include the results in the revised manuscript.

      (4) Line 201. The analysis of decoding on a per-patient basis is important, but it should be done on a per-session basis - i.e., considering only simultaneously recorded neurons, without any pooling. This is because pooling can overestimate decoding performances (see e.g. Quian Quiroga and Panzeri NRN 2009). If there was only one session per patient, then this should be called 'per-session' rather than 'per-patient' to make it clear that there was no pooling.

      The per-patient decoding was indeed also a per-session decoding, as each patient contributed only a single session to the dataset. We will make note of this explicitly in the text to resolve the ambiguity.

      (6) Lines 406-407. The claim that stimulus-selective responses to characters did not account for the decoding of the same character is very surprising. If I understood it correctly, the response criterion the authors used gives 'responsiveness' but not 'selectivity'. So, were people's responses selective (e.g., firing only to Summer) or non-selective (firing to a few characters)? This could explain why they didn't get good decoding results with responsive neurons. Again, it would be nice to see confusion matrices with the decoding of the characters. Another reason for this is that what are labelled as responsive neurons have relatively weak and variable responses.

      We thank the reviewer for pointing out the importance of selectivity in addition to responsiveness. Indeed, our response criterion does not take stimulus selectivity into account and exclusively measures increases in firing activity after feature onsets for a given feature irrespective of other features.

      We will adjust the text to reflect this shortcoming of the response-detection approach used here. To clarify the relationship between neural populations, we will add visualizations of the overlap of responsive neurons across labels for each subregion. These figures will be included in the revised manuscript.

      In our approach, we trained separate networks for each feature to effectively mitigate the issue of correlated feature labels within the dataset (see earlier discussion). While this strategy effectively deals with the correlated features, it precluded the generation of standard confusion matrices, as classification was performed independently for each feature.

      To directly assess the feature selectivity of responsive neurons, we will fit generalized linear models to predict their firing rates from the features. This approach will enable us to quantify their selectivity and compare it to that of the broader neuronal population.

      (7) Line 455. The claim that 500 neurons drive decoding performance is very subjective. 500 neurons gives a performance of 0.38, and 50 neurons gives 0.33.

      We agree with the reviewer that the phrasing is unclear. We will adjust our summary of this analysis in Line 455 to reflect that the logistic regression-derived neuronal rankings yield a subset of neurons that achieves comparable performance.

      (8) Lines 492-494. I disagree with the claim that "character decoding does not rely on individual cells, as removing neurons that responded strongly to character onset had little impact on performance". I have not seen strong responses to characters in the paper. In particular, the response to Summer in Figure 2 looks very variable and relatively weak. If there are stronger responses to characters, please show them to make a convincing argument. It is fine to argue that you can get information from the population, but in my view, there are no good single-cell responses (perhaps because the actors and the movie were unknown to the subjects) to make this claim. Also, an older paper (Quian Quiroga et al J. Neurophysiol. 2007) showed that the decoding of individual stimuli in a picture presentation paradigm was determined by the responsive neurons and that the non-responsive neurons did not add any information. The results here could be different due to the use of movies instead of picture presentations, but most likely due to the fact that, in the picture presentation paradigm, the pictures were of famous people for which there were strong single neuron responses, unlike with the relatively unknown persons in this paper.

      This is an important point and we thank the reviewer for highlighting a previous paradigm in which responsive neurons did drive decoding performance. Indeed, the fact that the movie, its characters and the corresponding actors were novel to patients could explain the disparity in decoding performance by way of weaker and more variable responses. We will include additional examples in the supplement of responses to features. Additionally, we will modify the text to emphasize the point that reliable decoding is possible even in the absence of a robust set of neuronal responses. It could indeed be the case that a decoder would place more weight on responsive units if they were present (as shown in the mentioned paper and in our decoding from visual transitions in the parahippocampal cortex).

    1. eLife Assessment

      This study presents valuable findings by demonstrating that specific GPCR subtypes induce distinct extracellular vesicle miRNA signatures, highlighting a potential novel mechanism for intercellular communication with implications for receptor pharmacology within the field. The data are compelling; however, more evidence is needed to determine whether the distinct extracellular vesicle miRNA signatures result from GPCR-dependent miRNA expression or GPCR-dependent incorporation of miRNAs into extracellular vesicles.

    2. Reviewer #1 (Public review):

      Summary:

      In this manuscript, the authors explore a novel concept: GPCR-mediated regulation of miRNA release via extracellular vesicles (EVs). They perform an EV miRNA cargo profiling approach to investigate how specific GPCR activations influence the selective secretion of particular miRNAs. Given that GPCRs are highly diverse and orchestrate multiple cellular pathways - either independently or collectively - to regulate gene expression and cellular functions under various conditions, it is logical to expect alterations in gene and miRNA expression within target cells.

      Strengths:

      The novel idea of GPCR-mediated control of EV loading of miRNAs.

      Weaknesses:

      The findings are incomplete: they do not connect the observed changes to any directly related physiological parameters, and mechanistic detail is lacking.

      The manuscript falls short of providing a comprehensive understanding. Identifying changes in cellular and EV-associated miRNAs without elucidating their physiological significance or underlying regulatory mechanisms limits the study's impact. Without demonstrating whether these miRNA alterations have functional consequences, the findings alone are insufficient. The findings may be suitable for more specialized journals.

      Furthermore, a critical analysis of the relationship between cellular miRNA levels and EV miRNA cargo is essential. Specifically, comparing the intracellular and EV-associated miRNA pools could reveal whether specific miRNAs are preferentially exported, a behavior that should be inversely related to their cellular abundance if export serves a beneficial function by reducing intracellular levels. This comparison is vital to strengthen the biological relevance of the findings and support the proposed regulatory mechanisms by GPCRs.

    3. Reviewer #2 (Public review):

      Summary:

      This study examines how activating specific G protein-coupled receptors (GPCRs) affects the microRNA (miRNA) profiles within extracellular vesicles (EVs). The authors seek to identify whether different GPCRs produce unique EV miRNA signatures and what these signatures could indicate about downstream cellular processes and pathological processes.

      Methods:

      (1) Used U2OS human osteosarcoma cells, which naturally express multiple GPCR types.

      (2) Stimulated four distinct GPCRs (ADORA1, HRH1, FZD4, ACKR3) using selective agonists.

      (3) Isolated EVs from culture media and characterized them via size exclusion chromatography, immunoblotting, and microscopy.

      (4) Employed qPCR-based miRNA profiling and bioinformatics analyses (e.g., KEGG, PPI networks) to interpret expression changes.

      Key Findings:

      (1) No significant change in EV quantity or size following GPCR activation.

      (2) Each GPCR triggered a distinct EV miRNA expression profile.

      (3) miRNAs differentially expressed post-stimulation were linked to pathways involved in cancer, insulin resistance, neurodegenerative diseases, and other physiological/pathological processes.

      (4) miRNAs such as miR-550a-5p, miR-502-3p, miR-137, and miR-422a emerged as major regulators following specific receptor activation.

      Conclusions:

      The study offers evidence that GPCR activation can regulate intercellular communication through miRNAs encapsulated within extracellular vesicles (EVs). This finding paves the way for innovative drug-targeting strategies and enhances understanding of drug side effects that are mediated via GPCR-related EV signaling.

      Strengths:

      (1) Innovative concept: The idea of linking GPCR signaling to EV miRNA content is novel and mechanistically important.

      (2) Robust methodology: The use of multiple validation methods (biochemical, biophysical, and statistical) lends credibility to the findings.

      (3) Relevance: GPCRs are major drug targets, and understanding off-target or systemic effects via EVs is highly valuable for pharmacology and medicine.

      Weaknesses:

      (1) Sample Size & Scope: The analysis included only four GPCRs. Expanding to more receptor types or additional cell lines would enhance the study's applicability.

      (2) Exploratory Nature: This study is primarily descriptive and computational. It lacks functional validation, such as assessing phenotypic effects in recipient cells, which is acknowledged as a future step.

      (3) EV heterogeneity: The authors recognize that they did not distinguish EV subpopulations, potentially confounding the origin and function of miRNAs.

    4. Author response:

      Reviewer #1 (Public review):

      Summary:

      In this manuscript, the authors explore a novel concept: GPCR-mediated regulation of miRNA release via extracellular vesicles (EVs). They perform an EV miRNA cargo profiling approach to investigate how specific GPCR activations influence the selective secretion of particular miRNAs. Given that GPCRs are highly diverse and orchestrate multiple cellular pathways - either independently or collectively - to regulate gene expression and cellular functions under various conditions, it is logical to expect alterations in gene and miRNA expression within target cells.

      Strengths:

      The novel idea of GPCR-mediated control of EV loading of miRNAs.

      Weaknesses:

      The findings are incomplete: they do not connect the observed changes to any directly related physiological parameters, and mechanistic detail is lacking.

      We appreciate the reviewer's acknowledgment of the novelty of this study. We agree with the reviewer that further mechanistic insights would strengthen the manuscript. The mechanisms by which miRNAs are sorted into EVs remain poorly understood. Various factors, including RNA-binding proteins, sequence motifs, and cellular location, can influence this sorting process (Garcia-Martin et al., 2022; Liu & Halushka, 2025; Villarroya-Beltri et al., 2013; Yoon et al., 2015). Ago2, a key component of the RNA-induced silencing complex, binds to miRNAs and facilitates their sorting. Ago2 has been found in EVs and can be regulated by cellular signaling pathways. For instance, McKenzie et al. demonstrated that KRAS-dependent activation of MEK-ERK can phosphorylate Ago2, thereby regulating the sorting of specific miRNAs into EVs (McKenzie et al., 2016). In differentiated PC12 cells, Gαq activation leads to the formation of Ago2-associated granules, which selectively sequester unique transcripts (Jackson et al., 2022). Investigating the effects of GPCRs, G proteins, and GPCR signaling on Ago2 expression, localization, and phosphorylation state could provide valuable insights into how GPCRs regulate specific miRNAs within EVs. We have expanded on these potential mechanisms and future research directions in the discussion section.

      The manuscript falls short of providing a comprehensive understanding. Identifying changes in cellular and EV-associated miRNAs without elucidating their physiological significance or underlying regulatory mechanisms limits the study's impact. Without demonstrating whether these miRNA alterations have functional consequences, the findings alone are insufficient. The findings may be suitable for more specialized journals.

      Thank you for the feedback. We acknowledge that validating the target genes of the top candidate miRNAs is an important next step. In response to the reviewer's concerns, we have expanded the discussion of future research in the manuscript. Although this initial study is primarily descriptive, it establishes a novel conceptual link between GPCR signaling and EV-mediated communication.

      Furthermore, a critical analysis of the relationship between cellular miRNA levels and EV miRNA cargo is essential. Specifically, comparing the intracellular and EV-associated miRNA pools could reveal whether specific miRNAs are preferentially exported, a behavior that should be inversely related to their cellular abundance if export serves a beneficial function by reducing intracellular levels. This comparison is vital to strengthen the biological relevance of the findings and support the proposed regulatory mechanisms by GPCRs.

      We appreciate the valuable suggestions from the reviewer. EV miRNAs and cellular miRNAs may exhibit distinct profiles, as miRNAs can be selectively sorted into or excluded from EVs (Pultar et al., 2024; Teng et al., 2017; Zubkova et al., 2021). Investigating the difference between cellular miRNA levels and EV miRNA cargo would provide insight into the mechanism of miRNA sorting and the functions of miRNAs in the recipient cells. The expression of cellular miRNAs is a highly dynamic process. To compare miRNA expression levels accurately, EV miRNAs and cellular miRNAs should be profiled simultaneously. However, because this was a pilot study, we could not measure the cellular miRNAs without repeating the entire experiment.
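
      The comparison the reviewer proposes could be summarized, once paired data exist, as a per-miRNA log2 ratio of EV to cellular abundance. The sketch below assumes normalized abundances on a linear scale from matched samples; the function name, the pseudocount, and the miRNA names and values are placeholders for illustration, not measurements from the study.

```python
from math import log2

def ev_enrichment(ev_levels, cell_levels, pseudocount=1.0):
    """log2(EV / cell) per miRNA; > 0 suggests EV enrichment, < 0 cellular retention.

    Both inputs are dicts of normalized abundance on a linear scale.
    """
    shared = ev_levels.keys() & cell_levels.keys()  # only miRNAs measured in both
    return {m: log2((ev_levels[m] + pseudocount) / (cell_levels[m] + pseudocount))
            for m in sorted(shared)}

# Placeholder values for illustration only.
ev = {"miR-A": 31.0, "miR-B": 3.0, "miR-C": 7.0}
cell = {"miR-A": 7.0, "miR-B": 15.0, "miR-C": 7.0}
ratios = ev_enrichment(ev, cell)
```

      Comparing these ratios between stimulated and unstimulated cells would help separate GPCR-dependent changes in miRNA expression from GPCR-dependent changes in sorting into EVs.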

      Reviewer #2 (Public review):

      Summary:

      This study examines how activating specific G protein-coupled receptors (GPCRs) affects the microRNA (miRNA) profiles within extracellular vesicles (EVs). The authors seek to identify whether different GPCRs produce unique EV miRNA signatures and what these signatures could indicate about downstream cellular processes and pathological processes.

      Methods:

      (1) Used U2OS human osteosarcoma cells, which naturally express multiple GPCR types.

      (2) Stimulated four distinct GPCRs (ADORA1, HRH1, FZD4, ACKR3) using selective agonists.

      (3) Isolated EVs from culture media and characterized them via size exclusion chromatography, immunoblotting, and microscopy.

      (4) Employed qPCR-based miRNA profiling and bioinformatics analyses (e.g., KEGG, PPI networks) to interpret expression changes.

      Key Findings:

      (1) No significant change in EV quantity or size following GPCR activation.

      (2) Each GPCR triggered a distinct EV miRNA expression profile.

      (3) miRNAs differentially expressed post-stimulation were linked to pathways involved in cancer, insulin resistance, neurodegenerative diseases, and other physiological/pathological processes.

      (4) miRNAs such as miR-550a-5p, miR-502-3p, miR-137, and miR-422a emerged as major regulators following specific receptor activation.

      Conclusions:

      The study offers evidence that GPCR activation can regulate intercellular communication through miRNAs encapsulated within extracellular vesicles (EVs). This finding paves the way for innovative drug-targeting strategies and enhances understanding of drug side effects that are mediated via GPCR-related EV signaling.

      Strengths:

      (1) Innovative concept: The idea of linking GPCR signaling to EV miRNA content is novel and mechanistically important.

      (2) Robust methodology: The use of multiple validation methods (biochemical, biophysical, and statistical) lends credibility to the findings.

      (3) Relevance: GPCRs are major drug targets, and understanding off-target or systemic effects via EVs is highly valuable for pharmacology and medicine.

      Weaknesses:

      (1) Sample Size & Scope: The analysis included only four GPCRs. Expanding to more receptor types or additional cell lines would enhance the study's applicability.

      We are encouraged that the reviewer recognized the novelty, methodological rigor, and significance of our work. We recognize the limitations of our current model system and emphasize the need to test additional GPCR families and cell lines in future studies, as detailed in the discussion section.

      (2) Exploratory Nature: This study is primarily descriptive and computational. It lacks functional validation, such as assessing phenotypic effects in recipient cells, which is acknowledged as a future step.

      We appreciate the feedback. We recognize the importance of validating the function of the top candidate miRNAs in the recipient cells, and this will be included in future studies. 

      (3) EV heterogeneity: The authors recognize that they did not distinguish EV subpopulations, potentially confounding the origin and function of miRNAs.

      Thank you for the comment. EV isolation and purification are major challenges in EV research. Current isolation techniques are often ineffective at separating vesicles produced by different biogenetic pathways. Furthermore, the lack of specific markers to differentiate EV subtypes adds to this complexity. We recognize that the presence of various subpopulations can complicate the interpretation of EV cargos. In our study, we used a combined approach of ultrafiltration followed by size-exclusion chromatography to achieve a balance between EV purity and yield. We adhere to the MISEV (Minimal Information for Studies of Extracellular Vesicles 2023) guidelines by reporting detailed isolation methods, assessing both positive and negative protein markers, and characterizing EVs by electron microscopy to confirm vesicle structure, as well as nanoparticle tracking analysis to verify particle size distribution (Welsh et al., 2024). By following these guidelines, we can ensure the quality of our study and enhance the ability to compare our findings with other studies.

    1. eLife Assessment

      In this important manuscript, Ryan et al perform a genome-wide CRISPR based screen to identify genes that modulate TDP-43 levels in neurons. They identify a number of genes and pathways and highlight the BORC complex, which is required for anterograde lysosome transport as one such regulator of TDP-43 protein levels. Overall, this is a convincing study, which opens the door for additional future investigations on the regulation of TDP-43.

    2. Reviewer #1 (Public review):

      Summary:

      As TDP-43 mislocalization is a hallmark of multiple neurodegenerative diseases, the authors seek to identify pathways that modulate TDP-43 levels. To do this, they use a FACS based genome wide CRISPR KD screen in a Halo tagged TDP-43 KI iPSC line. Their screen identifies a number of genetic modulators of TDP-43 expression including BORC which plays a role in lysosome transport.

      Strengths:

      Genome wide CRISPR based screen identifies a number of modulators of TDP-43 expression to generate hypotheses regarding RNA BP regulation and perhaps insights into disease

    3. Reviewer #2 (Public review):

      Summary:

      The authors employ a novel CRISPRi FACS screen and uncover the lysosomal transport complex BORC as a regulator of TDP-43 protein levels in iNeurons. They also find that BORC subunit knockouts impair lysosomal function, leading to slower protein turnover and implicating lysosomal activity in the regulation of TDP-43 levels. This is highly significant for the field given that a) other proteins could also be regulated in this way, b) understanding mechanisms that influence TDP-43 levels are significant given that its dysregulation is considered a major driver of several neurodegenerative diseases and c) the novelty of the proposed mechanism.

      Strengths:

      The novelty and information provided by the CRISPRi screen. The authors provide evidence indicating that BORC subunit knockouts impair lysosomal function, leading to slower protein turnover and implicating lysosomal activity in the regulation of TDP-43 levels and show a mechanistic link between lysosome mislocalization and TDP-43 dysregulation. The study highlights the importance of localized lysosome activity in axons and suggests that lysosomal dysfunction could drive TDP-43 pathologies associated with neurodegenerative diseases like FTD/ALS. Further, the methods and concepts will have an impact on the larger community as well. The work also sets up further work to understand the somewhat paradoxical findings that even though the tagged TDP-43 protein is reduced in the screen, it does not alter cryptic exon splicing and there is a longer TDP-43 half-life with BORC KD.

    4. Reviewer #3 (Public review):

      Summary:

      In this work, Ryan et al. have performed a state-of-the-art full genome CRISPR-based screen of iNeurons expressing a tagged version of TDP-43 in order to determine expression modifiers of this protein. Unexpectedly, using this approach the authors have uncovered a previously undescribed role of the BORC complex in affecting the levels of TDP-43 protein, but not mRNA expression. Taken together, these findings represent a very solid piece of work that will certainly be important for the field.

      Strengths:

      BORC is a novel TDP-43 expression modifier that has never been described before and it seemingly acts on regulating protein half-life rather than transcriptome level. It has been long known that different labs have reported different half-lives for TDP-43 depending on the experimental system but no work has ever explained these discrepancies. Now, the work of Ryan et al. has for the first time identified one of these factors which could account for these differences and play an important role in disease (although this is left to be determined in future studies).

      The genome wide CRISPR screening has been shown to yield novel results with high reproducibility and could eventually be used to search for expression modifiers of many other proteins involved in neurodegeneration or other diseases.

    1. eLife Assessment

      Seon and Chung investigate changes in participants' own risk-taking behavior when they are being observed by a "risky" or "safe" player. Using computational modeling and model-informed fMRI, the authors present convincing evidence that participants adjust their choices to be congruent with the other player's type (either risky or safe). The conclusions of the paper are an important contribution to the field of social decision-making as they show a differentiated adjustment of choices and not just a universally riskier choice behavior when being observed, as has been claimed in previous studies.

    2. Reviewer #2 (Public review):

      Summary:

      This study aims to investigate how social observation influences risky decision-making. Using a gambling task, the study explored how participants adjusted their risk-taking behavior when they believed their decisions were being observed by either a risk-averse or risk-seeking partner. The authors hypothesized that individuals would simulate the choices of their observers based on learned preferences and integrate these simulated choices into their own decision-making. In addition to behavioral experiments, the study employed computational modeling to formalize decision processes and fMRI to identify the neural underpinnings of risky decision-making under social observation.

      Strengths:

      The study provides a fresh perspective on social influence in decision-making, moving beyond the simple notion that social observation leads to uniformly riskier behavior. Instead, it shows that individuals adjust their choices depending on their beliefs about the observer's risk preferences, offering a more nuanced understanding of how social contexts shape decision-making. The authors provide evidence using comprehensive approaches, including behavioral data based on a well-designed task, computational modeling, and neuroimaging. The three models are well selected to compare at which level (e.g., computing utility, risk preference shift, and choice probability) the social influence alters one's risky decision-making. This approach allows for a more precise understanding of the cognitive processes underlying decision-making under social observation.

      Weaknesses:

      While the neuroimaging results are generally consistent with the behavioral and computational findings, the strength of the neural evidence could be improved. The authors' claims about the involvement of the TPJ and mPFC in integrating social information are plausible, but further analysis, such as model comparisons at the neuroimaging level, is needed to decisively rule out alternative interpretations that other computational models suggest.

      My concern raised above in the previous round has been addressed with the newly added results. I now find the manuscript substantially improved.

      I have only a minor suggestion: when discussing the conflict-related signals observed in the dACC and dlPFC, I encourage the authors to include alternative interpretations beyond conflict monitoring per se. For example, these signals may also reflect processes related to information updating during social learning or inference. While the study does not aim to dissociate these possibilities, acknowledging them would enrich the discussion and provide a broader perspective for readers.

      Comments on revised version:

      Thank you for the substantial revision. I believe the additional analyses have meaningfully strengthened the manuscript, particularly by improving the connection between the behavioral modeling and neuroimaging results. The findings are consistent with prior work while also providing novel insights.

      When discussing the conflict-related signals observed in the dACC/dlPFC, I encourage the authors to include alternative interpretations in addition to conflict monitoring per se. For example, these signals may also reflect processes related to information updating during social learning or inference. While the study does not aim to dissociate these possibilities, acknowledging them would enrich the discussion and offer a broader perspective for readers.

      I have updated my evaluation of the strength of evidence from Solid to Convincing.

    3. Reviewer #3 (Public review):

      Summary:

      This is an important paper using a novel paradigm to examine how observation affects social contagion of risk preferences. There is a lot of interest in the field on the mechanisms of social influence, and adding in the factor of whether observation also influences these contagion effects is intriguing.

      Strengths:

      There is an impressive combination of a multi-stage behavioural task as well as computational modelling and neuroimaging. The analyses are well conducted and the sample size is reasonable.

      Comments on revised version:

      Thank you for your helpful responses to my concerns. The manuscript is much improved and will make an important contribution to the literature. I have one remaining clarification. My request was for the authors to speculate in the discussion about lifespan differences in susceptibility to social influence, because the paper talks about how observing others' choices makes people riskier. I think it is important to explicitly acknowledge in the discussion that the sample tested was young adults, and it may be that the effects they observe are not the same in adolescents or older adults, as suggested in recent work (e.g. Reiter et al., 2019 Nat Comms, Su et al., 2024, Comms Psych). This is important to qualify general statements about how humans behave when observing others' risky decisions.

    1. eLife Assessment

      This fundamental study demonstrates that lipid binding can regulate the dimerization state of the SARS-CoV2 Orf9b protein. The data from biophysical and cellular experiments along with mathematical modeling are compelling. This paper is broadly relevant to those studying coupled equilibria across all aspects of biology.

    2. Reviewer #1 (Public review):

      Summary:

      Felipe and colleagues try to answer an important question in Sarbecovirus Orf9b-mediated interferon signaling suppression, given that this small viral protein adopts two distinct conformations, a dimeric β-sheet-rich fold and a helix-rich monomeric fold when bound by Tom70 protein. Two Orf9b structures determined by X-ray crystallography and Cryo-EM suggest an equilibrium between the two Orf9b conformations, and it is important to understand how this equilibrium relates to its functions. To answer these questions, the authors developed a series of ordinary differential equations (ODE) describing the Orf9b conformation equilibrium between homodimers and monomers binding to Tom70. They used SPR and a fluorescent polarization (FP) peptide displacement assay to identify parameters for the equilibrium and create a theoretical model. They then used the model to characterize the effect of lipid-binding and the effects of Orf9b mutations in homodimer stability, lipid binding, and dimer-monomer equilibrium. They used their model to further analyze dimerization, lipid binding, and Orf9b-Tom70 interactions for truncated Orf9b, Orf9b fusion mutant S53E (blocking Tom70 binding), and Orf9b from a set of Sars-CoV-2 VOCs. They evaluated the ability of different Orf9b variants for binding Tom70 using Co-IP experiments and assessed their activity in suppressing IFN signaling in cells.

      Overall, this work is well designed, the results are of high quality and well-presented; the results support their conclusions.

      Strengths:

      (1) They developed a working biophysical model for analyzing Orf9b monomer-dimer equilibrium and Tom70 binding based on SPR and FP experiments; this is an important tool for future investigation.

      (2) They prepared lipid-free Orf9b homodimer and determined its crystal structure.

      (3) They designed and purified obligate Orf9b monomer, fused-dimer, etc., a very important Orf9b variant for further investigations.

      (4) They identified the lipid bound by Orf9b homodimer using mass spectra data.

      (5) They proposed a working model of Orf9b-Tom70 equilibrium.

      Weaknesses:

      (1) It is difficult to understand why the obligate Orf9b dimer has similar IFN inhibition activity as the WT protein and obligate Orf9b monomer truncations.

      (2) The role of Orf9b homodimer and the role of Orf9b-bound lipid in virus infection, remains unknown.

      Comments on revisions:

      In the revised manuscript, the authors have addressed my concerns.

    3. Reviewer #2 (Public review):

      Summary:

      This study focuses on Orf9b, a SARS-COV1/2 protein that regulates innate signaling through interaction with Tom70. San Felipe et al use a combination of biophysical methods to characterize the coupling between lipid-binding, dimerization, conformational change, and protein-protein-interaction equilibria for the Orf9b-Tom70 system. Their analysis provides a detailed explanation for previous observations of Orf9b function. In a cellular context, they find other factors may also be important for the biological functioning of Orf9b.

      Strengths:

      San Felipe et al elegantly combine structural biology, biophysics, kinetic modelling, and cellular assays, allowing detailed analysis of the Orf9b-Tom70 system. Such complex systems involving coupled equilibria are prevalent in various aspects of biology, and a quantitative description of them, while challenging, provides a detailed understanding and prediction of biological outcomes. Using SPR to guide initial estimates of the rate constants for solution measurements is an interesting approach.

      Weaknesses:

      This study would benefit from a more quantitative description of uncertainties in the numerous rate constants of the models, either through a detailed presentation of the sensitivity analysis or another approach such as MCMC. Quantitative uncertainty analysis, such as MCMC, is not trivial for ODEs, particularly when they involve many parameters and are to be fitted to numerous data points, as is the case for this study. The authors use sensitivity analysis as an alternative; however, the results of the sensitivity analysis are not presented in detail, and I believe the authors should consider whether there is a way to present this analysis more quantitatively. For example, could the residuals for each +/-10% parameter change for the peptide model be presented as a supplementary figure, and similarly for the more complex models? Further details of the range of rate constants tested would be useful, particularly for the ka and kB parameters.

      The authors build a model that incorporates an α-helix-β-sheet conformational change, but the rate constant for the conversion to the α-helix conformation is required to be second order. Although the authors provide some rationale, I do not find this sufficiently convincing given the large number of adjustable parameters in the model and the use of manual model fitting. The authors should discuss whether there is any precedent for second-order rate constants for conformational changes in the literature. On page 14, the authors state this rate constant "had to be non-linear in the monomer β-sheet concentration" - how many other models did the authors explore? For example, would αT↔α↔αα↔ββ (i.e., conformational change before dimer dissociation) or α↔βαT↔ββ (i.e., Tom70 binding driving dimer dissociation) be other plausible models for the conformational change that do not require assumptions of second-order rate constants for the conformational change?
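      To make the distinction at issue concrete, the two generic rate laws below contrast a first-order and a second-order dependence of the conversion rate on the monomeric β-sheet concentration. This is a textbook-style sketch only: the symbols $[\beta]$, $[\alpha]$, $k_{1}$, and $k_{2}$ are illustrative assumptions, not the notation or equations of the manuscript, and only the forward conversion term is shown.

```latex
% Illustrative symbols (assumptions, not the manuscript's notation):
%   [\beta]  = concentration of monomeric beta-sheet Orf9b
%   [\alpha] = concentration of monomeric alpha-helical Orf9b
\begin{align}
\text{first order (linear in } [\beta]\text{):}\quad
  & \frac{d[\alpha]}{dt} = k_{1}\,[\beta],
  & k_{1} &\ \text{in s}^{-1} \\
\text{second order (non-linear in } [\beta]\text{):}\quad
  & \frac{d[\alpha]}{dt} = k_{2}\,[\beta]^{2},
  & k_{2} &\ \text{in M}^{-1}\,\text{s}^{-1}
\end{align}
```

      Under the first form, halving $[\beta]$ halves the conversion rate; under the second, it reduces the rate fourfold, which is the kind of concentration dependence a unimolecular conformational change would not ordinarily be expected to show.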

      Overall, this study progresses the analysis of coupled equilibria and provides insights into Orf9b function.

      Comments on revisions:

      The authors have done a satisfactory job addressing my concerns.

      Regarding my recommendations to the authors - point 7: "Orf9b-FITC:Tom70" and "PT", representing the same species, are still both used in the equations on page 14, which is confusing for anyone who may wish to re-use the model. I appreciate this is quite a subtle point but given the importance of the model for the manuscript I feel the authors should do their due diligence to ensure it is presented as clearly as possible.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      Felipe and colleagues try to answer an important question in Sarbecovirus Orf9b-mediated interferon signaling suppression, given that this small viral protein adopts two distinct conformations, a dimeric β-sheet-rich fold and a helix-rich monomeric fold when bound by Tom70 protein. Two Orf9b structures determined by X-ray crystallography and Cryo-EM suggest an equilibrium between the two Orf9b conformations, and it is important to understand how this equilibrium relates to its functions. To answer these questions, the authors developed a series of ordinary differential equations (ODE) describing the Orf9b conformation equilibrium between homodimers and monomers binding to Tom70. They used SPR and a fluorescent polarization (FP) peptide displacement assay to identify parameters for the equilibrium and create a theoretical model. They then used the model to characterize the effect of lipid-binding and the effects of Orf9b mutations in homodimer stability, lipid binding, and dimer-monomer equilibrium. They used their model to further analyze dimerization, lipid binding, and Orf9b-Tom70 interactions for truncated Orf9b, Orf9b fusion mutant S53E (blocking Tom70 binding), and Orf9b from a set of Sars-CoV-2 VOCs. They evaluated the ability of different Orf9b variants for binding Tom70 using Co-IP experiments and assessed their activity in suppressing IFN signaling in cells.

      Overall, this work is well designed, the results are of high quality and well-presented; the results support their conclusions.

      We thank reviewer #1 for their thoughtful assessment of our work and their constructive feedback.

      Strengths:

      (1) They developed a working biophysical model for analyzing Orf9b monomer-dimer equilibrium and Tom70 binding based on SPR and FP experiments; this is an important tool for future investigation.

      (2) They prepared lipid-free Orf9b homodimer and determined its crystal structure.

      (3) They designed and purified obligate Orf9b monomer, fused-dimer, etc., a very important Orf9b variant for further investigations.

      (4) They identified the lipid bound by Orf9b homodimer using mass spectra data.

      (5) They proposed a working model of Orf9b-Tom70 equilibrium.

      Weaknesses:

      (1) It is difficult to understand why the obligate Orf9b dimer has similar IFN inhibition activity as the WT protein and obligate Orf9b monomer truncations.

      We thank the reviewer for their observation and agree that the obligate homodimer IFN results were not what we expected to observe given our FP kinetic results with the purified obligate homodimer and noted our surprise in the discussion. We also note that we have two possible hypotheses for why this is the case.

      In our discussion, we noted the possible introduction of an increased avidity effect with fused homodimer and have improved it as follows with additions in red:

      “This result was unexpected as we had anticipated the obligate homodimer results to resemble the phosphomimetic. We hypothesize that this may be explained by two possible factors. First, we can’t exclude the introduction of an increased avidity between Orf9b and Tom70 when using the fused homodimer. Although our modeled decrease in the association rate of Orf9b:Tom70 (which increases the K<sub>D</sub> of the complex) suggests that fusing two copies of Orf9b decreases the affinity to Tom70, one copy of the fusion construct could also be capable of either binding to two copies of Tom70, or, one copy of the fusion could undergo rapid rebinding to Tom70. These effects would lead to a much tighter interaction in cellular assays than we modeled in vitro. A second possible explanation is that our assumptions about high lipid binding are not valid for cell based assays.”

      We also noted that a second possible explanation is due to our limitations in isolating the apo-fused homodimer to compare to the lipid-bound fused homodimer and possible differences this could have on our assays and briefly expanded upon this. Again, we improved this with additions in red:

      “As we have shown with both WT and fusion constructs, recombinantly expressed and purified Orf9b is lipid-bound and this can stabilize the homodimer to slow or inhibit the binding to Tom70. For the Orf9b fusion construct, we attempted to isolate the lipid-free species through protein refolding as previously described to compare the effect of lipid-binding on the homodimer fusion (similar to our WT experiments); however, we could not recover the stably folded homodimer. We hypothesize that the discrepancy between our kinetic results and Co-IP/IFN results could be due to subsaturation of the Orf9b fusion homodimers by lipids in cell based assays. While we have shown that lipid-binding occurs in recombinant expression systems, it is possible that in our cell based signaling assays that lipid-binding only affects a minor population of Orf9b. Given that we were unable to isolate the apo-fusion homodimer, we could not directly compare whether there are differences in fusion homodimer stability in the presence or absence of lipid-binding. Therefore, it is possible that the apo-fusion homodimer undergoes unfolding and refolding into alpha helices that lead to Tom70 binding similar to the WT construct.”

      (2) The role of Orf9b homodimer and the role of Orf9b-bound lipid in virus infection, remains unknown.

      We agree that we did not try to directly test for the role of the homodimer during infection and this remains an open area of exploration for future studies. We have included this caveat in our discussion but suggested possible experiments and future directions that could help shed light on this:

      “Although we have not directly tested for the role the homodimer conformation plays during infection, we have demonstrated that lipid-binding to the homodimer can bias the equilibrium away from Tom70. Lipids including palmitate have been shown to act as both a signaling molecule as well as a post-translational modification during antiviral innate immune signaling (S Mesquita et al. 2024; Wen et al. 2022; S. Yang et al. 2019). As a post-translational modification (referred to as S-acylation), MAVS, a mitochondrial type 1 IFN signaling protein that associates with Tom70 (X.-Y. Liu et al. 2010; McWhirter, Tenoever, and Maniatis 2005; Seth et al. 2005), has been shown to be post-translationally palmitoylated which affects its ability to localize to the mitochondrial outer membrane during viral infection and is a known target of Orf9b (Bu et al. 2024; Lee et al. 2024). When this is impaired (either by mutation or by depletion of the palmitoylation enzyme ZDHHC24), IFN activation is impaired (Bu et al. 2024). Therefore, future investigations should consider if the homodimer conformation of Orf9b is capable of antagonizing other IFN signaling factors such as MAVS by binding to palmitoyl groups. Indeed, Orf9b has already been shown to be capable of binding to MAVS by Co-IP (Han et al. 2021), however, whether or not this occurs through the palmitoyl modification remains unknown.”

      Reviewer #2 (Public review):

      Summary:

      This study focuses on Orf9b, a SARS-COV1/2 protein that regulates innate signaling through interaction with Tom70. San Felipe et al use a combination of biophysical methods to characterize the coupling between lipid-binding, dimerization, conformational change, and protein-protein-interaction equilibria for the Orf9b-Tom70 system. Their analysis provides a detailed explanation for previous observations of Orf9b function. In a cellular context, they find other factors may also be important for the biological functioning of Orf9b.

      Strengths:

      San Felipe et al elegantly combine structural biology, biophysics, kinetic modelling, and cellular assays, allowing detailed analysis of the Orf9b-Tom70 system. Such complex systems involving coupled equilibria are prevalent in various aspects of biology, and a quantitative description of them, while challenging, provides a detailed understanding and prediction of biological outcomes. Using SPR to guide initial estimates of the rate constants for solution measurements is an interesting approach.

      Weaknesses:

      This study would benefit from a more quantitative description of uncertainties in the numerous rate constants of the models, either through a detailed presentation of the sensitivity analysis or another approach such as MCMC. Quantitative uncertainty analysis, such as MCMC, is not trivial for ODEs, particularly when they involve many parameters and are to be fitted to numerous data points, as is the case for this study. The authors use sensitivity analysis as an alternative; however, the results of the sensitivity analysis are not presented in detail, and I believe the authors should consider whether there is a way to present this analysis more quantitatively. For example, could the residuals for each +/-10% parameter change for the peptide model be presented as a supplementary figure, and similarly for the more complex models? Further details of the range of rate constants tested would be useful, particularly for the ka and kB parameters.

      We thank the reviewer for their constructive feedback and have generated supplemental figures providing a deeper analysis of the residuals for each model parameter adjusted +/- 10% from the reported values, which we have added to our supplemental figures as Figure 1 - Supplemental 3 and Figure 4 - Supplemental 5.

      We note that there are modest improvements in residual plots where model parameters are individually lowered by 10% from their reported value when considering this single dataset, however, our choice of using the reported values was driven by finding values that were suitable for improving model behavior across multiple concentration series in different datasets. Specifically, we have also included the RMSD values for each model parameter subjected to a +/-10% change from a single concentration time course as well as the percent change in RMSD relative to the RMSD generated by our reported model parameters to illustrate this. We have also included text that makes note of the observed pattern in the residuals from Figure 4 - Supplement 5 and provided some explanations for why this may occur.

      “Inspection of the residuals from the 5uM apo-Orf9b homodimer time course showed clear patterns when individual model parameters were subjected to a 10% increase or decrease from the reported values. While our proposed model qualitatively describes the concentration dependent change in kinetic behavior, the residual plots may suggest that additional binding reactions may also be occurring that are not captured by our model.”

      Figure 1 - Supplemental 3. Plots of residuals from Orf9b peptide model showing effect of an increase or decrease by 10% on each model parameter. All residuals and reporting are with respect to the 100uM of unlabeled Orf9b peptide condition. Blue dots: reported value. Red dots: 10% increase in reported value. Green dots: 10% decrease in reported value. Table reporting of RMSD values for model fits after +/-10% change to model parameter (Left column) and percent change in RMSD relative to reported model RMSD (Right column).

      “As an alternative to attempting to place CIs on the parameters, we performed sensitivity analysis to determine which parameters the model was most sensitive to (see methods and Figure 1 - Supplemental 3). Additionally, we note that the model parameters were derived from the fit of only one concentration (100uM), but fit the other concentrations equally well. We observed that the model parameter that was most sensitive to change was the rate of Orf9b-FITC:Tom70 ([PT]) dissociation when subjected to a 10% increase or decrease whereas all other model parameters showed no sensitivity to change (Figure 1 - Supplemental 3).”

      Figure 4 - Supplemental 5: Plot of residuals showing the effect of increasing or decreasing individual model parameters 10% compared to the reported values. All residual plots are with respect to the 5uM apo-Orf9b homodimer condition. Blue dots: reported value. Red dot: 10% increase in reported value. Green dot: 10% decrease in reported value. (Left columns) Table of RMSD values calculated from model fits showing the effect of both +/-10% change to individual model parameters. (Right columns) Percent change in RMSD values subjected to +/-10% change for individual model parameters relative to the RMSD of the reported model.

      We have also included the following revised text to accompany this figure.

      “Further, we repeated the sensitivity analysis described previously for the peptide model and also considered the sensitivity of model parameters by inspecting each individually (Figure 4- figure supplemental 5). We found that when examining the residuals of the lowest concentration of 5uM, the model was most sensitive to changes in three parameters: the rate of homodimer association and dissociation and the conversion from β to α-monomers.”

      “Therefore, under low concentrations of Orf9b homodimer, binding to Tom70 is limited by the rate of homodimer association and dissociation as well as the conversion of Orf9b monomers to the α-helical conformation.”

      We have also included a supplemental figure, Figure 4 - Supplemental 4, showing how changes in the model parameters ka and kB affect the model's behavior, to help illustrate the range of values tested.

      Figure 4 - Supplemental 4: Plots of model behavior showing the effect of changes to alpha-beta and beta-alpha monomer interconversion rates compared to experimental values. Data is modeled with respect to the apo-Orf9b homodimer 5uM condition. Black line represents reported model fit and values used.

      We have also incorporated the following revised text.

      “The model parameters k<sub>a</sub> and k<sub>B</sub> describe the rate of interchange between the β-sheet and α-helix monomer conformations. These parameters must be estimated by modeling because our assays do not allow us to directly measure the folding rates between these conformations. To identify these values, we performed a scan of k<sub>a</sub> and k<sub>B</sub> values that yielded the best agreement between the model and the experimental conditions (Figure 4 - figure supplemental 4).”

      The authors build a model that incorporates an α-helix-β-sheet conformational change, but the rate constant for the conversion to the α-helix conformation is required to be second order. Although the authors provide some rationale, I do not find this satisfactorily convincing given the large number of adjustable parameters in the model and the use of manual model fitting. The authors should discuss whether there is any precedence for second-order rate constants for conformational changes in the literature. On page 14, the authors state this rate constant "had to be non-linear in the monomer β-sheet concentration" - how many other models did the authors explore? For example, would αT↔α↔αα↔ββ (i.e., conformational change before dimer dissociation) or α↔βαT↔ββ (i.e., Tom70 binding driving dimer dissociation) be other plausible models for the conformational change that do not require assumptions of second-order rate constants for the conformational change?

      We thank the reviewer for their feedback. During our studies, we tested several models prior to the final one presented in Figure 4A. The first model that we tested as described in Figure 4 - Supplemental 3 described ββ↔α↔αT with no conformational change. We tested several models that integrated the existing structural data for both Orf9b and Tom70 and found that while these models could fit individual time series, they did not explain the concentration dependent changes in subsequent time series nor did they explain changes induced by lipid-binding and mutations in VOC.

      With respect to the possibilities of αT↔α↔αα↔ββ and α↔βαT↔ββ models, we have revised our manuscript to mention that we did test additional models before we settled on the model that we presented.

      “We tested different reaction schemes that incorporated the interconversion between β-sheet to α-helix conformations by considering models that described a conformational change in the homodimer leading to Tom70 binding rather than monomers. None of these models adequately described our experimental results, therefore we continued developing our model as outlined in Figure 4D”

      With respect to the second-order rate describing the fold change from β to α, we have added the revised text to the manuscript:

      “We initially tested the impact of keeping the rate constant k<sub>a</sub> first order, just like k<sub>B</sub> which did yield the sigmoidal behavior we observed in the 5uM apo-homodimer condition. However, this assumption failed to describe the data at other concentrations resulting in a substantial overestimation compared to our experimental results when holding k<sub>B</sub> at a constant value throughout. We found that when the β-sheet to α-helix rate (k<sub>a</sub> ) was made a second order rate constant, we were able to hold the rate constant across all concentrations tested suggesting a non-linearity in the monomer β-sheet concentration.”

      While this was surprising to us, we reasoned that a biological explanation for the second-order conversion from β to α is that the β-monomers may transiently self-associate to cooperatively fold into the α-helical conformation. We did acknowledge this choice to make the β to α parameter non-linear (unlike the α to β conversion, which is first order).

      We concede that we could not find specific examples describing non-linear kinetics comparable to the system we described in literature, however, such systems have been reported for proteins that exhibit high structural plasticity where transient interactions with another copy of the protein or another protein altogether drive folding changes and we have revised this manuscript to include some additional citations to papers that describe such systems (Zuber et al. 2022; Tuinstra et al. 2008).
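
      The distinction between a first-order and a second-order β to α step can be sketched numerically. The snippet below is only an illustration under stated assumptions: the rate law d[α]/dt = k_a[β]^n − k_B[α] with n = 1 or 2, the function names, and all numerical values are hypothetical and are not taken from the manuscript or from its Berkeley Madonna model. It shows that a second-order step behaves like a first-order step whose apparent rate constant grows with the β-monomer concentration.

```python
def apparent_rate(beta, k_a, order):
    """Apparent first-order rate constant of the beta -> alpha step.

    For order 1 this is k_a regardless of concentration; for order 2 it
    is k_a * [beta], so it scales with the monomer concentration.
    """
    return k_a * beta ** (order - 1)


def simulate_conversion(beta0, k_a, k_B, order, t_end=10.0, dt=1e-3):
    """Integrate beta <-> alpha with forward rate k_a * beta**order (RK4).

    Mass is conserved by construction: beta = total - alpha.
    Returns (alpha, beta) at t_end. Units are arbitrary (hypothetical).
    """
    total = beta0

    def d_alpha(alpha):
        beta = total - alpha
        return k_a * beta ** order - k_B * alpha

    alpha = 0.0
    for _ in range(int(round(t_end / dt))):
        k1 = d_alpha(alpha)
        k2 = d_alpha(alpha + 0.5 * dt * k1)
        k3 = d_alpha(alpha + 0.5 * dt * k2)
        k4 = d_alpha(alpha + dt * k3)
        alpha += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return alpha, total - alpha


if __name__ == "__main__":
    # Illustrative values only: a tenfold change in concentration.
    for conc in (5.0, 50.0):
        print(conc, apparent_rate(conc, 0.1, 1), apparent_rate(conc, 0.1, 2))
```

      Under these assumptions, a first-order constant fitted at one concentration would need to be re-tuned at another to mimic the concentration dependence, whereas a single second-order constant carries that dependence itself.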

      Overall, this study progresses the analysis of coupled equilibria and provides insights into Orf9b function.

      Reviewer #1 (Recommendations for the authors):

      (1) What was the unlabeled Orf9b peptide that was added to the pre-equilibrated Orf9b-FITC:Tom70 solution as a competitor? Figure 1D illustrates that the competitor was full-length Orf9b.

      We have revised the figure to illustrate that in this experiment, the competitor is the unlabeled Orf9b peptide and not the full-length Orf9b sequence.

      (2) Figure 2B, what is the higher Mw peak from refolded Orf9b homodimer?

      We have added the following revised text (highlighted in red) to the manuscript to clarify Figure 2B.

      “The SEC elution profile and retention volume of refolded Orf9b directly overlapped with natively folded homodimeric Orf9b and suggested a high recovery of the refolded homodimer with the early eluting peaks corresponding to either a chaperone-bound species (natively folded) or misfolded protein (refolded) as judged by SDS-PAGE (Figure 2B). Together, the overlap in elution peaks corresponding to the folded homodimer suggested a high recovery of the homodimer from the refolding conditions.”

      (3) Figure 2C, in the main text, the authors state that "...observed that the refolded homodimer structure closely aligned with the lipid-bound reference structure, which shows that the homodimer fold can be recovered after denaturing". Please provide structural comparison details here, software used? Rmsd and Dali Z-score.

      We have added the following revised text (highlighted in red) to the manuscript to clarify Figure 2C.

      “Aligning the structure of the Orf9b homodimer (PDB 6Z4U) with our structure of the refolded Orf9b homodimer (9N55) in Pymol resulted in an RMSD of 1.1Å. Further, we also searched our structures of the refolded Orf9b homodimer on the Dali server against the existing structures of the lipid-bound Orf9b homodimer which yielded a Z-score of 2.2 which shows good correspondence between the structures.”

      (4) To prove the refolded Orf9b homodimer did not contain lipid, could the authors provide mass spectra data for the refolded Orf9b sample and compare it with the results in Figure 2 - Supplemental 1.

      We do not have complete mass spectra data for the refolded homodimer samples, however, we feel that the native mass spectrometry data provides a good orthogonal comparison between natively folded and refolded samples for the presence or absence of lipids. We concede that we only used mass spectrometry to characterize the four peaks that were unique to the natively folded deconvoluted spectra which confirmed that shift in mass relative to the expected homodimer molecular weight corresponded to the two lipids we presented. However, we would expect that performing mass spectrometry on the refolded sample would only further confirm our observations from the crystal structures and the native mass spectrometry.

      (5) Have the authors tried to use analytical ultracentrifugation to analyze the Orf9b dimer-monomer equilibrium, given that AUC provides a much more accurate measurement of molecular mass?

      We thank the reviewer for this suggestion and agree that AUC could be an additional useful strategy for monitoring the dimer-monomer equilibrium and could provide additional validation of the molecular weights of both the monomer and homodimer.

      While we have not performed AUC, we have revised our manuscript to include more discussion about the determination of molecular weights by SEC.

      “For the Orf9b homodimer, the retention volume was consistent with molecular weight standards based on the expected molecular weight of the homodimer (~21kDa) and the standard (~29kDa). In the case of the Orf9b monomer, although we would expect the retention volume of the monomer (~10.6kDa) to be between the molecular weight standards of 13.4kDa and 6.5kDa, the greater retention volume could be explained by non-specific hydrophobic interactions between the monomeric Orf9b and the column.”

      (6) The authors used truncation of 7 C-terminal amino acids to generate an obligate Orf9b monomer for their assays. It would be interesting to mutate residues at the homodimer interface to generate Orf9b monomers rather than deleting residues. For example, mutate 91-96aa (FVVVTV) to negatively charged residues, which will not only disrupt the dimerization interface, but also impair lipid binding. The dimer interface mutant should then be tested in their SPR, FP assays, as well as IFN inhibition assays.

      We thank the reviewer for their suggestion and agree that mutation of the 7 C-terminal amino acids into negatively charged residues could be an interesting alternative strategy to generating an obligate Orf9b monomer without the need for truncating the residues. Our choice of using the truncated construct we proposed was driven by our analysis of the structure of the homodimer which reveals that a significant portion of the dimer interface is composed of backbone-backbone hydrogen bonding between the two chains of Orf9b. We reasoned that truncating these residues would be the most effective way to compromise the interface between the two chains and drive a predominantly monomeric behavior, however, compromising the interface with multiple mutations is an intriguing alternative.

      Reviewer #2 (Recommendations for the authors):

      (1) The authors could comment on the slow monomer-dimer exchange observed by SEC and how it fits with their other analysis.

      We thank the reviewer for their comment and concede that the slow exchange may be a limitation of this experimental setup. Our observations from our SPR experiments and modeling showed us that the homodimer may be fast to dissociate into monomer given the off rate which would suggest a half-life for the homodimer to be on the order of seconds, however, we still observe a noticeable dimer species on the chromatograms. We initially allowed the diluted samples to reach equilibrium prior to injection onto the analytical sizing column, however, it is possible that the system is still in a pre-equilibrium prior to injection onto the column. This could be driven by interactions between the protein and the column that prevents full dissociation of the homodimer. While this is a limitation, we note that we did not use the Kd value that we determined by non-linear regression fitting to the equilibrium observed on the chromatograms for downstream experiments but instead used the value to get a ballpark estimate for the homodimer Kd which is on the same order as the Kd determined by SPR.
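
      The two quantities discussed in this response, a half-life implied by an off rate and the dimer population expected at equilibrium for a given Kd, can be related with a short calculation. The sketch below assumes a simple 2M ⇌ D equilibrium with Kd = [M]^2/[D] and first-order dimer dissociation; the function names and the numbers in the demo are hypothetical and are not the values measured by SPR or SEC.

```python
import math


def half_life(k_off):
    """Half-life of a first-order dissociation, t_1/2 = ln(2) / k_off."""
    return math.log(2) / k_off


def monomer_dimer_equilibrium(total_monomer_units, kd):
    """Equilibrium [M] and [D] for 2M <-> D with Kd = [M]^2 / [D].

    total_monomer_units is the total protein concentration counted as
    monomer chains, so total = [M] + 2[D]. Solving
    2[M]^2 / Kd + [M] - total = 0 gives the positive root below.
    """
    m = kd * (-1.0 + math.sqrt(1.0 + 8.0 * total_monomer_units / kd)) / 4.0
    d = (total_monomer_units - m) / 2.0
    return m, d


if __name__ == "__main__":
    # Illustrative numbers only (arbitrary but consistent units).
    print("half-life:", half_life(0.1))
    m, d = monomer_dimer_equilibrium(10.0, 1.0)
    print("fraction of chains in dimer:", 2.0 * d / 10.0)
```

      A calculation of this kind gives the dimer fraction expected once dilution has fully equilibrated, which is the reference point for judging whether a chromatogram still reflects a pre-equilibrium state.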

      (2) It might be useful to include the rate constants on the reaction arrows of the schematic representation of the models.

      We have revised Figure 4D to include the rates for both Orf9b monomer binding to Tom70 and Orf9b binding to Orf9b as derived from the SPR experiments as well as the modeled values for the interconversion between α and β monomers. We also revised Figure 7 to include these values as well as the modeled dissociation rate for homodimer when lipid-bound.

      (3) I couldn't find how the sensitivity analysis was performed for the more complex models. Was this the same +/- 10% as per the peptide model?

      We used the same +/- 10% sensitivity analysis for the peptide model in the more complex equilibrium model and have revised our manuscript to clearly reflect that.
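
      A one-at-a-time sensitivity analysis of this kind can be sketched as follows. This is a minimal illustration, not the authors' implementation: the single-exponential `model`, the parameter names, and the synthetic data are hypothetical stand-ins for the kinetic model, and only the procedure (perturb one parameter by +/-10%, recompute the RMSD, report the percent change relative to the reported parameter set) follows the description above.

```python
import math


def model(params, t):
    """Hypothetical stand-in model: a single-exponential decay curve."""
    return params["baseline"] + params["amplitude"] * math.exp(-params["k_off"] * t)


def rmsd(params, times, observed):
    """Root-mean-square deviation between observations and the model."""
    squared = [(obs - model(params, t)) ** 2 for t, obs in zip(times, observed)]
    return math.sqrt(sum(squared) / len(squared))


def sensitivity(params, times, observed, fraction=0.10):
    """Perturb each parameter up and down by `fraction`, one at a time.

    Returns the reference RMSD and, per parameter, the RMSD and its
    percent change relative to the reference for both directions.
    """
    reference = rmsd(params, times, observed)
    table = {}
    for name, value in params.items():
        row = {}
        for label, factor in (("up", 1.0 + fraction), ("down", 1.0 - fraction)):
            perturbed = dict(params)
            perturbed[name] = value * factor
            new_rmsd = rmsd(perturbed, times, observed)
            row[label] = (new_rmsd, 100.0 * (new_rmsd - reference) / reference)
        table[name] = row
    return reference, table


if __name__ == "__main__":
    reported = {"baseline": 10.0, "amplitude": 5.0, "k_off": 0.1}
    times = [2.0 * i for i in range(30)]
    # Synthetic "data": the model plus a small deterministic wobble.
    observed = [model(reported, t) + 0.2 * math.sin(t) for t in times]
    reference, table = sensitivity(reported, times, observed)
    for name, row in table.items():
        print(name, row)
```

      A negative percent change for a perturbed parameter means that the perturbed value fits that particular time course better than the reported one, which is the situation described above for the single-dataset comparison.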

      (4) Further clarification of "inspection of residuals suggested that the fits were accurate". In Figure 1B, the residuals look to have systematic errors, perhaps indicating other processes occurring.

      We agree that in the SPR kinetic fitting results for the Orf9b peptide binding to Tom70 in Figure 1B there are some regions where the fit over- or underestimates the experimental results. This is partially the result of limitations in the number of different binding models that we can fit in the analysis software, which is why we reported using a 1:1 Langmuir binding model. It is certainly possible that there may be some additional binding reactions that occur, however, we limited our use of these specific kinetic results to the peptide model that we proposed in Figure 1D. We did note in the manuscript text that it was necessary for us to change the model parameter values to some extent in order to fit our experimental results which may be partially explained by the SPR fitting errors.

      “With the parameter set obtained from the 100µM condition, we then held all parameters fixed and simply changed the peptide concentrations in the model to fit the remaining conditions by hand. We note that this process saw the model parameter values change between 3% at the lowest end up to 70% at the highest end from the experimentally derived values but remained within an order of magnitude of the experimental SPR values. We speculate that this arises due to the differences in experimental setup between SPR and FP-based methods of measuring kinetics.”
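
      For readers unfamiliar with the 1:1 Langmuir binding model mentioned in this response, its standard closed-form sensorgram and a simple residual check can be sketched as below. The function names, the sign-change diagnostic, and all values are illustrative assumptions; this is not the fitting routine of the SPR analysis software used by the authors.

```python
import math


def langmuir_association(t, conc, k_on, k_off, r_max):
    """Response during analyte injection for a 1:1 Langmuir model.

    R(t) = R_eq * (1 - exp(-(k_on * C + k_off) * t)),
    with R_eq = R_max * k_on * C / (k_on * C + k_off).
    """
    k_obs = k_on * conc + k_off
    r_eq = r_max * k_on * conc / k_obs
    return r_eq * (1.0 - math.exp(-k_obs * t))


def langmuir_dissociation(t, r_start, k_off):
    """Response after the injection ends: R(t) = R_start * exp(-k_off * t)."""
    return r_start * math.exp(-k_off * t)


def residuals(times, observed, predict):
    """Observed minus predicted response at each time point."""
    return [obs - predict(t) for t, obs in zip(times, observed)]


def sign_changes(values):
    """Count sign flips between consecutive residuals.

    Few flips over many points suggests long same-sign runs, that is,
    a systematic misfit rather than random scatter.
    """
    return sum(1 for a, b in zip(values, values[1:]) if a * b < 0)
```

      Long runs of same-sign residuals, as flagged by the reviewer, are the usual sign that a 1:1 model is missing a process, although this check alone does not identify which one.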

      (5) The manuscript builds logically, but given the sophisticated nature of the system and the modelling could benefit from more clarity/streamlining in the descriptions/illustrations.

      We have revised our manuscript in response to both reviewers' comments and hope that the clarity of the work is improved as a result.

      (6) Figure 4 Supplement 3 - where did the rate constants for Model 1 come from? Was there any attempt to alter them to fit the data better?

      We have clarified in the figure description that the rate constants used in Model 1 were the same values used in Figure 4B (but without the interconversion between beta and alpha rates).

      “Comparison of kinetic model 1 and 2 in describing experimental results from the kinetic binding assay. Experimental results using 10uM of refolded Orf9b homodimer are shown as rings with the predicted behavior of model 1 (equilibrium exchange) shown as a dark blue line. The predicted behavior of model 2 (equilibrium exchange with a conformational change between β-sheet and α-helical monomers) is shown as the light blue line. Model parameter values were the same as described in Figure 4D and kept constant in both model comparisons.”

      (7) What are and [PT] in the second set of equations (page 13)?

      [PT] refers to the concentration of the complex between the “fluorescent probe” (Orf9b-FITC) and Tom70.

      (8) "Additionally, the fused homodimer association rate (which can be viewed as a rate of tertiary complex formation)" - can the authors provide a mathematical proof for this?

      In the case of the fused homodimer kinetic data, we did not develop a separate model to explicitly take into account the differences between using a fused construct versus the WT construct that can dissociate into monomers. We have clarified our interpretation of this in the manuscript.

      “Although our model explicitly describes homodimer dissociation into monomers as a requisite step for Orf9b binding to Tom70, we adapted it for the fusion experimental data. In this case, all model parameters other than the association and dissociation kinetics of the fluorescent probe and Tom70 were adjusted to achieve the best agreement with the experimental data. When applied to the fusion homodimer, the parameters describing homodimer dissociation into separate monomers could instead describe the dissociation of the two β-sheet domains away from each other in the tertiary structure but remaining physically linked through the linker region.”

      (9) "For Lambda and Omicron, the P10S mutation results in the serine being positioned to form several hydrogen bonds between R13 and the backbone carbonyl of A11 and L48 within the same chain..." is this taken from AlphaFold predicted structures of the mutants? If so, it should be made clear that this is derived from predicted structures. And even so, AlphaFold can be poor at determining structures of mutants, and so there is greater uncertainty in the prediction of the bonds.

      For Lambda, Omicron, and Delta mutations, we used Pymol to examine how the placement of mutations could structurally explain the kinetic differences we observed in our model. We have gone back and clarified in the figure description that these predictions are not derived from AlphaFold.

      (10) "biological replicates" - is this different protein purifications?

      Yes, in this case biological replicates refer to different protein purifications for all variants described and tested.

      (11) Are any of the authors involved in the Berkeley Madonna commercial software used in the manuscript? If so, should this be in the conflict of interest statement?

      Yes, Michael Grabe is an owner of Berkeley Madonna, and we have updated our conflicts of interest statement to reflect this.

    1. eLife Assessment

      This important work describes a set of parameters that give a robust description of shape features of cells in tissues. The evidence for the usefulness of these parameters is solid. The work should be of interest for anybody analyzing epithelial dynamics, but more details about the analysis of experimental images are necessary and some streamlining of the text would increase the accessibility of the material for non-specialists.

    2. Reviewer #1 (Public review):

      Summary:

      The authors' stated aim is to introduce so-called Minkowski tensors to characterize and quantify the shape of cells in tissues. The authors introduce Minkowski tensors and then define the p-atic order q_p as a cell shape measure, where p is an integer. They also introduce a previously defined measure of p-atic order in the form of the parameter \gamma_p. The authors compute q_p for data obtained by simulating an active vertex model and a multiphase field model, where they focus on p=2 and p=6 - so-called nematic and hexatic order - as the two values of highest biological relevance. Based on their analysis, the authors state that q_2 and q_6 are independent, that there is no crossover for the coarse-grained quantities, that a comparison of q_p for different values of p is not meaningful, and determine the dependence of the mean value of q_2 and q_6 on cell activity and deformability. Subsequently, they apply their method to data from MDCK monolayers and argue that the full range of q_p values needs to be considered to characterize shape and positional order in epithelia.

      Strength:

      The work presents a set of parameters that are useful for analyzing cell shape.

      Weaknesses:

      The introduction of the Minkowski tensors is hardly accessible for typical biologists. Eventually, most quantification is done using q_p, which can be defined without recourse to Minkowski functionals. The relation to Minkowski functionals makes the important properties of robustness and stability evident. However, for an audience of biologists, the derivation of this property could be relegated to an Appendix. Instead, the text could directly go to the results of the analysis of experimental and modeling data.
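
      To illustrate how q_p can be stated directly, the sketch below computes it for a polygonal cell outline. It assumes the edge-based form commonly used for polygons in the Minkowski tensor literature, q_p = |Σ_k L_k exp(i p θ_k)| / Σ_k L_k, with L_k the length of edge k and θ_k the direction of its normal; the function name is hypothetical, and the manuscript under review may use a different normalization or a formulation for curved boundaries.

```python
import cmath
import math


def q_p(vertices, p):
    """p-atic shape measure of a polygon given as ordered (x, y) vertices.

    Each edge contributes its length times exp(i * p * theta), where
    theta is the direction of the edge normal; the magnitude of the sum
    is divided by the perimeter, so the result lies between 0 and 1.
    The magnitude does not depend on the vertex ordering direction.
    """
    psi = 0j
    perimeter = 0.0
    n = len(vertices)
    for k in range(n):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % n]
        ex, ey = x1 - x0, y1 - y0
        length = math.hypot(ex, ey)
        theta = math.atan2(-ex, ey)  # normal (ey, -ex) of the edge (ex, ey)
        psi += length * cmath.exp(1j * p * theta)
        perimeter += length
    return abs(psi) / perimeter


if __name__ == "__main__":
    hexagon = [(math.cos(math.pi * k / 3), math.sin(math.pi * k / 3)) for k in range(6)]
    rectangle = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)]
    print("hexagon   q_2, q_6:", q_p(hexagon, 2), q_p(hexagon, 6))
    print("rectangle q_2, q_6:", q_p(rectangle, 2), q_p(rectangle, 6))
```

      Under this definition a regular hexagon has q_6 = 1 and q_2 = 0, while elongating a shape raises q_2, which is the intuition behind reading p = 2 as nematic and p = 6 as hexatic order.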

      Important details about how the cell shapes are extracted from the experimental data are missing. The two data sets the authors consider are not analyzed in the same way.

    3. Reviewer #3 (Public review):

      Hapel et al. present an article entitled Quantifying the shape of cells - from Minkowski tensors to p-atic order. The paper reports the p-atic quantitative method - established in physics - to extract cell full shapes in biological experiments using their images of epithelial MDCK cells (phase contrast) and also images reported in another paper as well as their own simulations based on active vertex model and multiphase field approaches. Authors present the rationale of this new strategy for quantification. They adapt the method of Minkowski tensors and they extract distributions of cell shapes readouts with plots of their distributions. An emphasis is given to changes in cell shapes captured by this method. Higher rank tensors are considered as well as representations with intuitive meanings and q_i orders and their potential correlations or absence of correlations - for example q_2 and q_6, leading to statements about nematic and hexatic orders.

      This analysis and its strength are contrasted with Armengol-Collade et al. (2023) quoted in the paper, who consider polygonal shapes for cells and their shape function 𝛾_p. Authors support the notion of a key improvement thanks to Minkowski tensors approach and doing so, they challenge the former crossovers correlations statements reported in Armengol-Collade et al. (2023). In this context, they defend that nematic liquid crystals approach is not sufficient to capture cell dynamics in tissues. Also they propose that q_2 and q_6 could serve as readout for activity and deformability of cells among other statements related to their approach.

      A variety of analytical methods have been realised to track cells in monolayers in vitro and in vivo during morphogenesis - for example, shear decomposition (from MPI-PKS Dresden) or links joining centroids and their neighbours approach (MSC/Curie Paris) to name a few examples. It will be interesting in the future that systematic comparisons between these analytical methods are performed with highlights on their respective advantages and drawbacks. This will allow experimentalists to identify the best relevant methods to address their morphogenetic questions.

    1. eLife Assessment

      The study provides valuable technical advances to generate and isolate neural rosettes. The technique is robust, as indicated by both reviewers. The evidence is solid, as shown in orthogonal characterization by flow cytometry, morphology, and scRNA-seq. Comparison with the manual-rosette-picking protocol will enhance the validity of the claims.

    2. Reviewer #1 (Public review):

      Summary:

      The authors aimed to develop a fully scalable, feeder-free protocol for deriving dorsal forebrain neural rosette stem cells (NRSCs) from human pluripotent stem cells, eliminating the need for manual rosette isolation. Using dynamic suspension culture combined with single-SMAD inhibition (RepSox), they sought to generate FOXG1⁺/OTX2⁺ NRSCs within ten days and expand them through at least twelve passages while retaining regional identity. They also aimed to demonstrate the cells' capacity to differentiate into functional neurons, astrocytes, and oligodendrocytes under defined conditions.

      Strengths:

      A key strength is the elimination of labour-intensive manual rosette picking, which significantly reduces operator variability and enhances throughput. The authors provide diverse validation in the form of flow cytometry showing >95% OTX2⁺ over passages 2-12, immunocytochemistry, single-cell RNA-seq, and functional MEA recordings, confirming both regional fidelity and neuronal activity. They also demonstrate glial differentiation and reproducibility across two hESC lines.

      The results convincingly demonstrate that the RepSox/suspension approach yields high-purity dorsal forebrain neural progenitor cells (NRSCs) that maintain marker expression and multipotency through passage 12 and differentiate into electrophysiologically active neurons and mature glia. Thus, the authors have achieved their primary objectives.

      This protocol addresses a significant bottleneck in neural stem cell production by providing a reproducible, high-throughput alternative that is well-suited to drug screening, disease modelling, and potential cell therapy manufacturing. Standardised, scalable NRSC banks will accelerate neurodevelopmental and neurodegenerative disorder studies, enable automated bioreactor workflows, and encourage the sharing of resources across academia and industry.

      Weaknesses:

      Weaknesses include a lack of direct comparison to conventional manual-selection protocols, and the need to improve the statistical rigor of all quantitative assays by applying appropriate hypothesis tests (e.g., t-tests or ANOVA with multiple-comparison correction) rather than reporting mean ± SD alone.

      Additional Context:

      Beyond the core technical advance, it's important to situate this work within the broader landscape of neural stem cell research and its downstream applications. Traditionally, dorsal forebrain NSCs have been generated via manual rosette picking after dual-SMAD inhibition (Chambers et al., 2009), a process that is labor-intensive, low-throughput, and prone to operator-dependent variability. By eliminating that step, this protocol directly addresses a key barrier to standardizing NSC production under GMP-compatible conditions - critical for both large-scale drug screening and eventual clinical use. Stable, regionally specified forebrain NSCs are especially valuable for modeling early neurodevelopmental disorders (e.g., autism spectrum disorders, microcephaly) and late-onset pathologies (e.g., Alzheimer's disease) in vitro, where precise cortical patterning is essential to recapitulate disease phenotypes. Moreover, establishing long-term epigenetic fidelity (e.g., via future ATAC-seq or histone-mark profiling) will further reassure users that transcriptional consistency reflects preserved regulatory networks, not just transient marker expression. Finally, demonstrating robust cryopreservation viability (>80%) makes these cells a readily shareable resource for the community, accelerating cross-lab reproducibility and comparative studies of patient-derived iPSC lines. This context underscores how scalable, high-purity forebrain NSCs can transform both basic neuroscience research and translational pipelines.

    3. Reviewer #2 (Public review):

      In the present manuscript, Dannulat Frazier et al. provide a novel and advanced protocol for obtaining almost pure populations of neural rosette stem cells (NRSCs) expressing the general markers NES and SOX2. These NSCs are expandable and exhibit dorsal forebrain properties and markers that are maintained throughout passages in culture (at least until passage 12). The authors also demonstrate the multipotency of these NSCs by their ability to differentiate into functional neurons, and precursors of astrocytes and oligodendrocytes.

      This method does not require the usual step of manual rosette selection and allows a greater homogeneity of the NSCs obtained and the standardization of the protocol, which will allow greater advances in the applications of these NSCs in research and as models of disease or compound testing. The manuscript is of great interest for the research area, since it describes a new methodology that can facilitate the research and therapeutic application of NSCs.

      The manuscript is well-written; the results are clear, robust, and well-explained. The conclusions reached in this paper are well-supported by the data, but some aspects could be better clarified.

      (1) The results presented in the present manuscript of the NSCs are performed up to passage 12; it would be interesting to know up to which passages these cells can be expanded, maintaining their initial properties. Have the authors analyzed passages beyond 12?

      (2) In Figure 2A, where different markers are shown in NSCs at different passages, it seems that at passage 12, there is a decrease in TJP1+ zones in relation to earlier passages, which could indicate a reduction in the potential to generate rosettes. Have the authors done any quantification along these lines? Could this be the case, or is it just an effect of the image chosen?

      (3) In Figure 3A, the decrease in the expression of the PAX6 gene at passage 8 relative to passage 2 is very striking and intriguing, and it does not correspond to what is observed at the protein level. Have the authors verified this result using another technique, such as RT-qPCR?

      (4) In Figure 5B, the labeling for GFAP appears rather nuclear, despite being a cytoskeleton protein. How can the authors explain this?

    4. Author response:

      Reviewer #1 (Public review):

      Thank you for your thoughtful and constructive feedback on our manuscript. We greatly appreciate your insights regarding our work, as they are invaluable in refining our research.

      We are very happy to hear that you recognize the strengths of our method, particularly the elimination of manual rosette picking, which significantly enhances throughput and reduces variability. We are also pleased that our validation efforts—through flow cytometry, immunocytochemistry, single-cell RNA-sequencing, and functional MEA recordings—effectively demonstrate both the identity and functionality of our derived dorsal forebrain neural rosette stem cells (NRSCs).

      Regarding the identified weaknesses, we agree that a direct comparison with conventional manual-selection protocols, specifically those utilizing dual-SMAD inhibition, would be a significant improvement. To address this, we have initiated additional experiments that will directly compare our single-SMAD inhibition approach (RepSox) with dual-SMAD inhibition (SB/LDN), aiming for a comprehensive evaluation of both protocols.

      In terms of statistical rigor, we appreciate your suggestion on improving our quantitative assays. All data were collected from at least three independent experiments and are presented as mean ± standard deviation unless otherwise specified. Because most of the data are qualitative, no formal statistical tests were performed for most experiments; for the quantitative measurements obtained, we calculated the mean and standard deviation as a descriptive summary. Where possible, we will incorporate appropriate statistical tests to present our data more robustly, rather than merely reporting mean ± SD.

      Finally, we recognize the importance of situating our work within the broader landscape of neural stem cell research. We aim to elucidate the potential downstream applications for our protocol, which we believe will significantly impact neurodevelopmental and neurodegenerative disorder studies.

      Thank you again for your valuable suggestions. We look forward to refining our manuscript and enhancing the contribution of our research to the field.

      Reviewer #2 (Public review):

      Thank you for your thoughtful and constructive feedback on our manuscript. We appreciate your recognition of the novelty and potential impact of our protocol for obtaining neural rosette stem cells (NRSCs). Your comments are invaluable in improving our work.

      We are pleased that you found our methodology to be a significant advancement in the field, particularly the elimination of the manual rosette selection step, which hopefully will enhance homogeneity and standardization. We agree that this development has implications for research, disease modelling, and compound testing.

      Regarding your specific points:

      Passage expansion: Thank you for your insightful suggestion regarding the analysis beyond passage 12. We have continued passaging our NRSC line for more than 12 passages while maintaining the rosette structure. Although we do not yet have comprehensive and detailed analyses at these later passages, we will include some data and relevant information on our findings in the revised manuscript.

      TJP1+ zones: We appreciate your observation regarding the decreased TJP1+ zones at passage 12. We have not consistently detected a reduction in the number of rosettes or TJP1+ lumens across our cultures between passages. While some variability has been noted, we occasionally observe minor reductions at specific time points, followed by a recovery of rosettes in subsequent passages. This suggests that monitoring the number of rosettes is indeed a useful indicator of cell culture health. Cultures should be discarded if rosettes are completely lost. We will take a closer look at this aspect and report the findings in the revised manuscript.

      PAX6 Gene expression verification: Thank you for highlighting the discrepancy between PAX6 gene expression levels and protein levels. Unfortunately, we have not yet validated these results using an alternative technique. One potential explanation for this discrepancy may be the phenomenon of negative autoregulation, where increased levels of PAX6 protein can inhibit its own mRNA expression (Manuel et al., 2007). Moreover, Hsieh and Yang (2009) observed that during neurogenesis, PAX6 protein levels may not correlate linearly with mRNA levels, particularly in variable cellular environments. Additionally, post-transcriptional regulatory mechanisms, such as translation initiation mediated by Internal Ribosome Entry Sites (IRES), have been documented in various contexts involving PAX6, suggesting that mRNA levels may not fully represent functional protein levels in developing tissues (Li et al., 2023). We will go deeper into this discussion in the revised manuscript.

      GFAP Labeling: We appreciate your comments regarding the nuclear labeling of GFAP. In our astrocyte cultures, we have indeed observed GFAP localization in both the nucleus and the cytoplasm (Figure 5B). We will investigate this phenomenon further and provide a clearer explanation, supported by relevant literature, in the revised version. Although GFAP is primarily categorized as an intermediate filament protein localized in the cytoplasm, evidence suggests its nuclear localization may indicate additional regulatory roles during astrocyte development, activation, and pathology. This finding highlights the potential complexity of GFAP's role during fetal development and cellular stress, suggesting a broader functional scope that may extend into the nuclear space.

      Once again, thank you for your insightful feedback and for recognizing the potential of our research. We are committed to addressing your comments and enhancing the quality of our manuscript.

      Manuel, M. et al. (2007) ‘Controlled overexpression of Pax6 in vivo negatively autoregulates the Pax6 locus, causing cell-autonomous defects of late cortical progenitor proliferation with little effect on cortical arealization’, Development, 134(3), pp. 545–555. Available at: https://doi.org/10.1242/dev.02764.

      Hsieh, Y.-W. and Yang, X.-J. (2009) ‘Dynamic Pax6 expression during the neurogenic cell cycle influences proliferation and cell fate choices of retinal progenitors’, Neural Development, 4(1), p. 32. Available at: https://doi.org/10.1186/1749-8104-4-32.

      Li, Q. et al. (2023) ‘Translation of paired box 6 (PAX6) mRNA is IRES-mediated and inhibited by cymarin in breast cancer cells’, Genes & Genetic Systems, 98(4), pp. 161–169. Available at: https://doi.org/10.1266/ggs.23-00039.

    1. eLife Assessment

      The authors collected valuable time-course RNA-seq data from four tree species in natural environments and analyzed seasonal patterns of gene expression. The genome assemblies and gene expression data across multiple species and tissues are convincing, but the overarching conclusions are inadequately supported due to weaknesses in the study design, which encompasses three different environments and two distinct time periods. This makes it impossible to disentangle genetic effects - which are critical for evolutionary inferences - from environmental influences on gene expression.

    2. Reviewer #1 (Public review):

      Summary:

      The authors performed genome assemblies for two Fagaceae species and collected transcriptome data from four natural tree species every month over two years. They identified seasonal gene expression patterns and further analyzed species-specific differences.

      Strengths:

      The study of gene expression patterns in natural environments, as opposed to controlled chambers, is gaining increasing attention. The authors collected RNA-seq data monthly for two years from four tree species and analyzed seasonal expression patterns. The data are novel. The authors could revise the manuscript to emphasize seasonal expression patterns in three species (with one additional species having more limited data). Furthermore, the chromosome-scale genome assemblies for the two Fagaceae species represent valuable resources, although the authors did not cite existing assemblies from closely related species.

      Weaknesses:

      The study design has a fundamental flaw regarding the evaluation of genetic or evolutionary effects. As a basic principle in biology, phenotypes, including gene expression levels, are influenced by genetics, environmental factors, and their interaction. This principle is well-established in quantitative genetics.

      In this study, the four species were sampled from three different sites (see Materials and Methods, lines 543-546), and additionally, two species were sampled from 2019-2021, while the other two were sampled from 2021-2023 (see Figure S2). This critical detail should be clearly described in the Results and Materials and Methods. Due to these variations in sampling sites and periods, environmental conditions are not uniform across species.

      Even in studies conducted in natural environments, there are ways to design experiments that allow genetic effects to be evaluated. For example, by studying co-occurring species, or through transplant experiments, or in common gardens. To illustrate the issue, imagine an experiment where clones of a single species were sampled from three sites and two time periods, similar to the current design. RNA-seq analysis would likely detect differences that could qualitatively resemble those reported in this manuscript.

      One example is in line 197, where genus-specific expression patterns are mentioned. While it may be true that the authors' conclusions (e.g., winter synchronization, phylogenetic constraints) reflect real biological trends, these conclusions are also predictable even without empirical data, and the current dataset does not provide quantitative support.

      If the authors can present a valid method to disentangle genetic and environmental effects from their dataset, that would significantly strengthen the manuscript. However, I do not believe the current study design is suitable for this purpose.

      Unless these issues are addressed, the use of the term "evolution" is inappropriate in this context. The title should be revised, and the result sections starting from "Peak months distribution..." should be either removed or fundamentally revised. The entire Discussion section, which is based on evolutionary interpretation, should be deleted in its current form.

      If the authors still wish to explore genetic or evolutionary analyses, the pair of L. edulis and L. glaber, which were sampled at the same site and over the same period, might be used to analyze "seasonal gene expression divergence in relation to sequence divergence." Nevertheless, the manuscript would benefit from focusing on seasonal expression patterns without framing the study in evolutionary terms.

      To better support the seasonal expression analysis, the early RNA-seq analysis sections should be strengthened. There is little discussion of biological replicate variation or variation among branches of the same individual. These could be important factors to analyze. In line 137, the mapping rate for two species is mentioned, but the rates for each species should be clearly reported. One RNA-seq dataset is based on a species different from the reference genome, so a lower mapping rate is expected. While this likely does not hinder downstream analysis, quantification is important.

      In Figures 2A and 2B, clustering is used to support several points discussed in the Results section (e.g., lines 175-177). However, clustering is primarily a visualization method or a hypothesis-generating tool; it cannot serve as a statistical test. Stronger conclusions would require further statistical testing.

      The quality of the genome assemblies appears adequate, but related assemblies should be cited and discussed. Several assemblies of Fagaceae species already exist, including Quercus mongolica (Ai et al., Mol Ecol Res, 2022), Q. gilva (Front Plant Sci, 2022), and Fagus sylvatica (GigaScience, 2018), among others. Is there any novelty here? Can you compare your results with these existing assemblies?

      Most importantly, Figure 1B-D shows synteny between the two genera but also indicates homology between different chromosomes. Does this suggest paleopolyploidy or another novel feature? These chromosome connections should be interpreted in the main text, even if they could be methodological artifacts.

      In both the Results and Materials and Methods sections, descriptions of genome and RNA-seq data are unclear. In line 128, a paragraph on genome assembly suddenly introduces expression levels. RNA-seq data should be described before this. Similarly, in line 238, the sentence "we assembled high-quality reference genomes" seems disconnected from the surrounding discussion of expression studies. In line 632, Illumina short-read DNA sequencing is mentioned, but it's unclear how these data were used.

    3. Reviewer #2 (Public review):

      Summary:

      This study explores how gene expression evolves in response to seasonal environments, using four evergreen Fagaceae species growing in similar habitats in Japan. By combining chromosome-scale genome assemblies with a two-year RNA-seq time series in leaves and buds, the authors identify seasonal rhythms in gene expression and examine both conserved and divergent patterns. A central result is that winter bud expression is highly conserved across species, likely due to shared physiological demands under cold conditions. One of the intriguing implications of this study is that seasonal cycles might play a role similar to ontogenetic stages in animals. The authors touch on this by comparing their findings to the developmental hourglass model, and indeed, the recurrence of phenological states such as winter dormancy may act as a cyclic form of developmental canalization, shaping expression evolution in a way analogous to embryogenesis in animals.

      Strengths:

      (1) The evolutionary effects of seasonal environments on gene expression are rarely studied at this scale. This paper fills that gap.

      (2) The dataset is extensive, covering two years, two tissues, and four tree species, and is well suited to the questions being asked.

      (3) Transcriptome clustering across species (Figure 2) shows strong grouping by season and tissue rather than species, suggesting that the authors effectively controlled for technical confounders such as batch effects and mapping bias.

      (4) The idea that winter imposes a shared constraint on gene expression, especially in buds, is well argued and supported by the data.

      (5) The discussion links the findings to known concepts like phenological synchrony and the developmental hourglass model, which helps frame the results.

      Weaknesses:

      (1) While the hierarchical clustering shown in Figure 2A largely supports separation by tissue type and season, one issue worth noting is that some leaf samples appear to cluster closely with bud samples. The authors do not comment on this pattern, which raises questions about possible biological overlap between tissues during certain seasonal transitions or technical artifacts such as sample contamination. Clarifying this point would improve confidence in the interpretation of tissue-specific seasonal expression patterns.

      (2) While the study provides compelling evidence of conserved and divergent seasonal gene expression, it does not directly examine the role of cis-regulatory elements or chromatin-level regulatory architecture. Including regulatory genomic or epigenomic data would considerably strengthen the mechanistic understanding of expression divergence.

      (3) The manuscript includes a thoughtful analysis of flowering-related genes and seasonal GO enrichment (e.g., Figure 3C-D), providing an initial link between gene expression timing and phenological functions. However, the analysis remains largely gene-centric, and the study does not incorporate direct measurements of phenological traits (e.g., flowering or bud break dates). As a result, the connection between molecular divergence and phenotypic variation, while suggestive, remains indirect.

      (4) Although species were sampled from similar habitats, one species (Q. acuta) was collected at a higher elevation, and factors such as microclimate or local photoperiod conditions could influence expression patterns. These potential confounding variables are not fully accounted for, and their effects should be more thoroughly discussed or controlled in future analyses.

      (5) Statistical and Interpretive Concerns Regarding Δφ and dN/dS Correlation (Figures 5E and 5F):

      (a) Statistical Inappropriateness: Δφ is a discrete ordinal variable (likely 1-11), making it unsuitable for Pearson correlation, which assumes continuous, normally distributed variables. This undermines the statistical validity of the analysis.

      (b) Biological Interpretability: Even with the substantial statistical power afforded by genome-wide analysis, the observed correlations are extremely weak. This suggests that the relationship, if any, between temporal divergence in expression and protein-coding evolution is negligible.

      Taken together, these issues weaken the case for any biologically meaningful association between Δφ and dN/dS. I recommend either omitting these panels or clearly reframing them as exploratory and statistically limited observations.
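
      To illustrate the rank-based alternative implied by point (a), the following is a minimal sketch of a Spearman rank correlation with tie handling, written with the Python standard library only. The variable names and all numerical values are hypothetical toy inputs and are not taken from the manuscript; the sketch only shows how an ordinal variable such as Δφ could be correlated with dN/dS without assuming normality.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy values (assumption): peak-month differences and dN/dS per gene.
delta_phi = [0, 1, 1, 2, 3, 3, 5, 6]
dn_ds = [0.10, 0.12, 0.08, 0.15, 0.11, 0.20, 0.18, 0.30]
rho = spearman(delta_phi, dn_ds)
print(f"Spearman rho = {rho:.3f}")
```

      A rank-based coefficient addresses only the statistical concern in (a); a very small effect size, as raised in (b), would remain a matter of biological interpretation.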

    4. Author response:

      Reviewer #1 (Public review):

      Summary:

      The authors performed genome assemblies for two Fagaceae species and collected transcriptome data from four natural tree species every month over two years. They identified seasonal gene expression patterns and further analyzed species-specific differences.

      Strengths:

      The study of gene expression patterns in natural environments, as opposed to controlled chambers, is gaining increasing attention. The authors collected RNA-seq data monthly for two years from four tree species and analyzed seasonal expression patterns. The data are novel. The authors could revise the manuscript to emphasize seasonal expression patterns in three species (with one additional species having more limited data). Furthermore, the chromosome-scale genome assemblies for the two Fagaceae species represent valuable resources, although the authors did not cite existing assemblies from closely related species.

      Thank you for your careful assessment of our manuscript.

      Weaknesses:

      The study design has a fundamental flaw regarding the evaluation of genetic or evolutionary effects. As a basic principle in biology, phenotypes, including gene expression levels, are influenced by genetics, environmental factors, and their interaction. This principle is well-established in quantitative genetics.

      In this study, the four species were sampled from three different sites (see Materials and Methods, lines 543-546), and additionally, two species were sampled from 2019-2021, while the other two were sampled from 2021-2023 (see Figure S2). This critical detail should be clearly described in the Results and Materials and Methods. Due to these variations in sampling sites and periods, environmental conditions are not uniform across species.

      Even in studies conducted in natural environments, there are ways to design experiments that allow genetic effects to be evaluated. For example, by studying co-occurring species, or through transplant experiments, or in common gardens. To illustrate the issue, imagine an experiment where clones of a single species were sampled from three sites and two time periods, similar to the current design. RNA-seq analysis would likely detect differences that could qualitatively resemble those reported in this manuscript.

      One example is in line 197, where genus-specific expression patterns are mentioned. While it may be true that the authors' conclusions (e.g., winter synchronization, phylogenetic constraints) reflect real biological trends, these conclusions are also predictable even without empirical data, and the current dataset does not provide quantitative support.

      If the authors can present a valid method to disentangle genetic and environmental effects from their dataset, that would significantly strengthen the manuscript. However, I do not believe the current study design is suitable for this purpose.

      Unless these issues are addressed, the use of the term "evolution" is inappropriate in this context. The title should be revised, and the result sections starting from "Peak months distribution..." should be either removed or fundamentally revised. The entire Discussion section, which is based on evolutionary interpretation, should be deleted in its current form.

      If the authors still wish to explore genetic or evolutionary analyses, the pair of L. edulis and L. glaber, which were sampled at the same site and over the same period, might be used to analyze "seasonal gene expression divergence in relation to sequence divergence." Nevertheless, the manuscript would benefit from focusing on seasonal expression patterns without framing the study in evolutionary terms.

      We sincerely thank the reviewer for the detailed and thoughtful comments. We fully recognize the importance of carefully distinguishing genetic and environmental contributions in transcriptomic studies, particularly when addressing evolutionary questions. The reviewer identified two major concerns regarding our study design: (1) the use of different monitoring periods across species, and (2) the use of samples collected from different study sites. We addressed both concerns with additional analyses using 112 new samples and now present new evidence that supports the robustness of our conclusions.

      (1) Monitoring period variation does not bias our conclusions

      To address concerns about the differing monitoring periods, we added new RNA-seq data (42 samples each for bud and leaf samples for L. glaber and 14 samples each for bud and leaf samples for L. edulis) collected from November 2021 to November 2022, enabling direct comparison across species within a consistent timeframe. Hierarchical clustering of this expanded dataset (Fig. S6) yielded results consistent with our original findings: winter-collected samples cluster together regardless of species identity. This strongly supports our conclusion that the seasonal synchrony observed in winter is not an artifact of the monitoring period and demonstrates the robustness of our conclusions across datasets.

      (2) Site variation is limited and does not confound our findings

      Although the study included three sites, two of them (Imajuku and Ito Campus) are only 7.3 km apart, share nearly identical temperature profiles (see Fig. S2), and are located at the edge of similar evergreen broadleaf forests. Only Q. acuta was sampled from a higher-altitude, cooler site. To assess whether the higher elevation site of Q. acuta introduced confounding environmental effects, we reanalyzed the data after excluding this species. Hierarchical clustering still revealed that winter bud samples formed a distinct cluster regardless of species identity (Fig. S7), consistent with our original finding.

      Furthermore, we recalculated the molecular phenology divergence index D (Fig. 4C) and the interspecific Pearson’s correlation coefficients (Fig. 5A) without including Q. acuta. These analyses produced results that were similar to those obtained from the full dataset (Fig. S12; Fig. S14), indicating that the observed patterns are not driven by environmental differences associated with elevation.

      (3) Justification for our approach in natural systems

      We agree with the reviewer that experimental approaches such as common gardens, reciprocal transplants, and the use of co-occurring species are valuable for disentangling genetic and environmental effects. In fact, we have previously implemented such designs in studies using the perennial herb Arabidopsis halleri (Komoto et al., 2022, https://doi.org/10.1111/pce.14716) and clonal Someiyoshino cherry trees (Miyawaki-Kuwakado et al., 2024, https://doi.org/10.1002/ppp3.10548) to examine environmental effects on gene expression. However, extending these approaches to long-lived tree species in diverse natural ecosystems poses significant logistical and biological challenges. In this study, we addressed this limitation by including three co-occurring species at the same site, which allowed us to evaluate interspecific differences under comparable environmental conditions. Importantly, even when we limited our analyses to these co-occurring species, the results remained consistent, indicating that the observed variation in transcriptomic profiles cannot be attributed to environmental factors alone and likely reflects underlying genetic influences.

      Accordingly, we added four new figures (Fig. S6, Fig. S7, Fig. S12 and Fig. S14) and revised the manuscript to clarify the limitations and strengths of our design, to tone down the evolutionary claims where appropriate, and to more explicitly define the scope of our conclusions in light of the data. We hope that these efforts sufficiently address the reviewer’s concerns and strengthen the manuscript.

      To better support the seasonal expression analysis, the early RNA-seq analysis sections should be strengthened. There is little discussion of biological replicate variation or variation among branches of the same individual. These could be important factors to analyze. In line 137, the mapping rate for two species is mentioned, but the rates for each species should be clearly reported. One RNA-seq dataset is based on a species different from the reference genome, so a lower mapping rate is expected. While this likely does not hinder downstream analysis, quantification is important.

      We thank the reviewer for the helpful comment. To evaluate the variation among biological replicates, we compared the expression level of each gene across different individuals. We observed a high correlation between each pair of individuals (Q. glauca (n=3): average correlation coefficient r = 0.947; Q. acuta (n=3): r = 0.948; L. glaber (n=3): r = 0.948). This result suggests that the seasonal gene expression pattern is highly synchronized across individuals within the same species. We mentioned this point in the Results section of the revised manuscript. We also calculated the mean mapping rates for each species. As the reviewer expected, the mapping rate was slightly lower in Q. acuta (88.6 ± 2.3%) and L. glaber (84.3 ± 5.4%), whose RNA-Seq data were mapped to reference genomes of related but different species, than in Q. glauca (92.6 ± 2.2%) and L. edulis (89.3 ± 2.7%). However, we minimized the impact of these differences on downstream analysis. These details have been included in the revised main text.
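
      As a rough illustration of how an average between-individual correlation of this kind can be computed, the sketch below takes one expression vector per individual and averages Pearson's r over all pairs. It is not the pipeline used in the study; the function names, individual labels, and expression values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def mean_pairwise_correlation(profiles):
    """Average Pearson r over all pairs of individuals.

    `profiles` maps an individual label to its expression vector; every
    vector must list the same genes and time points in the same order.
    """
    pairs = combinations(sorted(profiles), 2)
    rs = [pearson(profiles[a], profiles[b]) for a, b in pairs]
    return sum(rs) / len(rs)


# Hypothetical expression values (assumption) for three individuals (n=3).
profiles = {
    "tree_1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "tree_2": [1.1, 2.1, 2.9, 4.2, 4.8],
    "tree_3": [0.9, 1.8, 3.2, 3.9, 5.1],
}
mean_r = mean_pairwise_correlation(profiles)
print(f"mean pairwise r = {mean_r:.3f}")
```

      With three individuals there are three pairs, so the reported average would summarize three coefficients per species under this reading.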

      In Figures 2A and 2B, clustering is used to support several points discussed in the Results section (e.g., lines 175-177). However, clustering is primarily a visualization method or a hypothesis-generating tool; it cannot serve as a statistical test. Stronger conclusions would require further statistical testing.

      We thank the reviewer for the helpful comment. As noted, we acknowledge that hierarchical clustering (Fig. 2A) is primarily a visualization and hypothesis-generating method. To assess the biological relevance of the clusters identified, we conducted a Mann-Whitney U test or a Steel-Dwass test to evaluate whether the environmental temperatures at the time of sample collection differed significantly among the clusters. This analysis (Fig. 2B) revealed statistically significant differences in temperature for cluster B3 (p < 0.01), indicating that the gene expression clusters are associated with seasonal thermal variation. These results support the interpretation that the clusters reflect coordinated transcriptional responses to environmental temperature. We revised the Results section to clarify this point.
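
      For readers unfamiliar with the test, the following minimal sketch shows a two-sided Mann-Whitney U test comparing collection temperatures between two clusters. It uses the normal approximation without tie or continuity correction, so it is an illustration rather than a substitute for a statistics package; the temperature values and variable names are hypothetical and do not come from our dataset.

```python
from math import erfc, sqrt


def mann_whitney_u(a, b):
    """Two-sided Mann-Whitney U test via the normal approximation.

    No tie or continuity correction is applied (assumption made to keep
    the sketch short). Returns the smaller U statistic and the p-value.
    """
    n1, n2 = len(a), len(b)
    # U1 counts pairs where a value from `a` exceeds one from `b`;
    # tied pairs contribute one half.
    u1 = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    u2 = n1 * n2 - u1
    u = min(u1, u2)
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = erfc(abs(z) / sqrt(2))  # two-sided tail area of the standard normal
    return u, p


# Hypothetical temperatures in degrees Celsius at sampling time (assumption).
cluster_winter = [4.1, 5.0, 6.2, 5.5, 3.8, 6.0]
cluster_other = [15.2, 18.4, 22.1, 25.3, 19.9, 12.7]
u_stat, p_value = mann_whitney_u(cluster_winter, cluster_other)
print(f"U = {u_stat}, p = {p_value:.4f}")
```

      When more than two clusters are compared, a multiple-comparison procedure such as the Steel-Dwass test is the appropriate choice, which this two-sample sketch does not cover.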

      The quality of the genome assemblies appears adequate, but related assemblies should be cited and discussed. Several assemblies of Fagaceae species already exist, including Quercus mongolica (Ai et al., Mol Ecol Res, 2022), Q. gilva (Front Plant Sci, 2022), and Fagus sylvatica (GigaScience, 2018), among others. Is there any novelty here? Can you compare your results with these existing assemblies?

      We agree that genome assemblies of Fagaceae species are becoming increasingly available. However, our study does not aim to emphasize the novelty of the genome assemblies per se. Rather, with the increasing availability of chromosome-level genomes, we regard genome assembly as a necessary foundation for more advanced analyses. The main objective of our study is to investigate how each gene is expressed in response to seasonal environmental changes, and to link genome information with seasonal transcriptomic dynamics. To address the reviewer’s comment in line with this objective, we added a discussion on the syntenic structure of eight genome assemblies spanning four genera within the Fagaceae, including a species from the genus Fagus (Ikezaki et al. 2025, https://doi.org/10.1101/2025.07.31.667835). This addition helps to position our work more clearly within the context of existing genomic resources.

      Most importantly, Figure 1B-D shows synteny between the two genera but also indicates homology between different chromosomes. Does this suggest paleopolyploidy or another novel feature? These chromosome connections should be interpreted in the main text, even if they could be methodological artifacts.

      A previous study on genome size variation in Fagaceae suggested that, given the consistent ploidy level across the family, genome expansion likely occurred through relatively small segmental duplications rather than whole-genome duplications. Because Figure 1B-D supports this view, we cited the following reference in the revised version of the manuscript.

      Chen et al. (2014)  https://doi.org/10.1007/s11295-014-0736-y

      In both the Results and Materials and Methods sections, descriptions of genome and RNA-seq data are unclear. In line 128, a paragraph on genome assembly suddenly introduces expression levels. RNA-seq data should be described before this. Similarly, in line 238, the sentence "we assembled high-quality reference genomes" seems disconnected from the surrounding discussion of expression studies. In line 632, Illumina short-read DNA sequencing is mentioned, but it's unclear how these data were used.

      We relocated the explanation regarding the expression levels of single-copy and multi-copy genes to the section titled “Seasonal gene expression dynamics.” Additionally, we clarified in the Materials and Methods section that short-read sequencing data were used for both genome size estimation and phylogenetic reconstruction.

      Reviewer #2 (Public review):

      Summary:

      This study explores how gene expression evolves in response to seasonal environments, using four evergreen Fagaceae species growing in similar habitats in Japan. By combining chromosome-scale genome assemblies with a two-year RNA-seq time series in leaves and buds, the authors identify seasonal rhythms in gene expression and examine both conserved and divergent patterns. A central result is that winter bud expression is highly conserved across species, likely due to shared physiological demands under cold conditions. One of the intriguing implications of this study is that seasonal cycles might play a role similar to ontogenetic stages in animals. The authors touch on this by comparing their findings to the developmental hourglass model, and indeed, the recurrence of phenological states such as winter dormancy may act as a cyclic form of developmental canalization, shaping expression evolution in a way analogous to embryogenesis in animals.

      Strengths:

      (1) The evolutionary effects of seasonal environments on gene expression are rarely studied at this scale. This paper fills that gap.

      (2) The dataset is extensive, covering two years, two tissues, and four tree species, and is well suited to the questions being asked.

      (3) Transcriptome clustering across species (Figure 2) shows strong grouping by season and tissue rather than species, suggesting that the authors effectively controlled for technical confounders such as batch effects and mapping bias.

      (4) The idea that winter imposes a shared constraint on gene expression, especially in buds, is well argued and supported by the data.

      (5) The discussion links the findings to known concepts like phenological synchrony and the developmental hourglass model, which helps frame the results.

      We are grateful to the reviewer for the detailed and thoughtful review of our manuscript.

      Weaknesses:

      (1) While the hierarchical clustering shown in Figure 2A largely supports separation by tissue type and season, one issue worth noting is that some leaf samples appear to cluster closely with bud samples. The authors do not comment on this pattern, which raises questions about possible biological overlap between tissues during certain seasonal transitions or technical artifacts such as sample contamination. Clarifying this point would improve confidence in the interpretation of tissue-specific seasonal expression patterns.

      The leaf samples that cluster with the bud samples are newly flushed leaves collected in April for Q. glauca, May for Q. acuta, May and June for L. edulis, and August and September for L. glaber. To clarify this point, we marked these newly flushed leaf samples with asterisks in the revised figure (Fig. 2A).

      (2) While the study provides compelling evidence of conserved and divergent seasonal gene expression, it does not directly examine the role of cis-regulatory elements or chromatin-level regulatory architecture. Including regulatory genomic or epigenomic data would considerably strengthen the mechanistic understanding of expression divergence.

      We thank the reviewer for this insightful comment. As noted in the Discussion section, we hypothesize that such genome-wide seasonal expression patterns, and their divergence across species, are likely mediated by cis-regulatory elements and chromatin-level mechanisms. While a direct investigation of regulatory architecture was beyond the scope of the present study, we fully agree that incorporating regulatory genomic and epigenomic data would significantly deepen the mechanistic understanding of expression divergence. In this regard, we are currently working to identify putative cis-regulatory elements in non-coding regions and are collecting epigenetic data from the same tree species using ChIP-seq. We believe the current study provides a foundation for these future investigations into the regulatory basis of seasonal transcriptome variation. We made a minor revision to the Discussion to note that an important future direction is to investigate the evolution of non-coding sequences that regulate gene expression in response to seasonal environmental changes.

      (3) The manuscript includes a thoughtful analysis of flowering-related genes and seasonal GO enrichment (e.g., Figure 3C-D), providing an initial link between gene expression timing and phenological functions. However, the analysis remains largely gene-centric, and the study does not incorporate direct measurements of phenological traits (e.g., flowering or bud break dates). As a result, the connection between molecular divergence and phenotypic variation, while suggestive, remains indirect.

      We would like to note that phenological traits have been observed in the field on a monthly basis throughout the sampling period and the phenological data were plotted together with molecular phenology (e.g. Fig. 2A, C; Fig. 3C, D). Although the temporal resolution is limited, these observations captured species-specific differences in key phenological events such as leaf flushing and flowering times. We revised the manuscript to clarify this point.

      (4) Although species were sampled from similar habitats, one species (Q. acuta) was collected at a higher elevation, and factors such as microclimate or local photoperiod conditions could influence expression patterns. These potential confounding variables are not fully accounted for, and their effects should be more thoroughly discussed or controlled in future analyses.

      We fully agree with the reviewer that local environmental conditions, including microclimate and photoperiod differences, could potentially influence gene expression patterns. To assess whether the higher elevation site of Q. acuta introduced confounding environmental effects, we reanalyzed the data after excluding this species. Hierarchical clustering still revealed that winter bud samples formed a distinct cluster regardless of species identity (Fig. S7), consistent with our original finding.

      Furthermore, we recalculated the molecular phenology divergence index D (Fig. 4C) and the interspecific Pearson’s correlation coefficients (Fig. 5A) without including Q. acuta. These analyses produced results that were qualitatively similar to those obtained from the full dataset (Fig. S12; Fig. S14), indicating that the observed patterns are not driven by environmental differences associated with elevation.

      We believe these additional analyses help to decouple the effects of environment and genetics, and support our conclusion that both seasonal synchrony and phylogenetic constraints play key roles in shaping transcriptome dynamics. We added four new figures (Fig. S6, Fig. S7, Fig. S12 and Fig. S14) and revised the text accordingly to clarify this point and to acknowledge the potential impact of site-specific environmental variation.

      (5) Statistical and Interpretive Concerns Regarding Δφ and dN/dS Correlation (Figures 5E and 5F):

      (a) Statistical Inappropriateness: Δφ is a discrete ordinal variable (likely 1-11), making it unsuitable for Pearson correlation, which assumes continuous, normally distributed variables. This undermines the statistical validity of the analysis.

      We thank the reviewer for the insightful comment. We would like to clarify that the analysis presented in Figures 5E and 5F was based on linear regression, not Pearson’s correlation. Although Δφ is a discrete variable, it takes values from 0 to 6 in 0.5 increments, resulting in 13 levels. We treated it as a quasi-continuous variable for the purposes of linear regression analysis. This approach is commonly adopted in practice when a discrete variable has sufficient resolution and ordering to approximate continuity. To enhance clarity, we revised the manuscript to explicitly state that linear regression was used, and we now report the regression coefficient and associated p-value to support the interpretation of the observed trend.

      (b) Biological Interpretability: Even with the substantial statistical power afforded by genome-wide analysis, the observed correlations are extremely weak. This suggests that the relationship, if any, between temporal divergence in expression and protein-coding evolution is negligible.

      Taken together, these issues weaken the case for any biologically meaningful association between Δφ and dN/dS. I recommend either omitting these panels or clearly reframing them as exploratory and statistically limited observations.

      We agree with the reviewer’s comment. While we retained the original panels, we reframed our interpretation to emphasize that, despite statistical significance, the observed correlation is very weak—suggesting that coding region variation is unlikely to be the primary driver of seasonal gene expression patterns. Accordingly, we revised the “Relating seasonal gene expression divergence to sequence divergence” section in the Results, as well as the relevant part of the Discussion.

    1. eLife Assessment

      This important study introduces an advance in multi-animal tracking by reframing identity assignment as a self-supervised contrastive representation learning problem. It eliminates the need for segments of video where all animals are simultaneously visible and individually identifiable, and significantly improves tracking speed, accuracy, and robustness with respect to occlusion. This innovation has implications beyond animal tracking, potentially connecting with advances in behavioral analysis and computer vision. While the strength of support for these advances is solid overall, the presentation could be greatly improved for clarity and broader accessibility; in addition, incorporating more standard metrics in the multi-animal tracking literature would better benchmark the approach against other methods.

    2. Reviewer #1 (Public review):

      Summary:

      This is a strong paper that presents a clear advance in multi-animal tracking. The authors introduce an updated version of idtracker.ai that reframes identity assignment as a contrastive learning problem rather than a classification task requiring global fragments. This change leads to gains in speed and accuracy. The method eliminates a known bottleneck in the original system, and the benchmarking across species is comprehensive and well executed. I think the results are convincing and the work is significant.

      Strengths:

      The main strengths are the conceptual shift from classification to representation learning, the clear performance gains, and the fact that the new version is more robust. Removing the need for global fragments makes the software more flexible in practice, and the accuracy and speed improvements are well demonstrated. The software appears thoughtfully implemented, with GUI updates and integration with pose estimators.

      Weaknesses:

      I don't have any major criticisms, but I have identified a few points that should be addressed to improve the clarity and accuracy of the claims made in the paper.

      (1) The title begins with "New idtracker.ai," which may not age well and sounds more promotional than scientific. The strength of the work is the conceptual shift to contrastive representation learning, and it might be more helpful to emphasize that in the title rather than branding it as "new."

      (2) Several technical points regarding the comparison between TRex (a system evaluated in the paper) and idtracker.ai should be addressed to ensure the evaluation is fair and readers are fully informed.

      (2.1) Lines 158-160: The description of TRex as based on "Protocol 2 of idtracker.ai" overlooks several key additions in TRex, such as posture image normalization, tracklet subsampling, and the use of uniqueness feedback during training. These features are not acknowledged, and it's unclear whether TRex was properly configured - particularly regarding posture estimation, which appears to have been omitted but isn't discussed. Without knowing the actual parameters used to make comparisons, it's difficult to assess how the method was evaluated.

      (2.2) Lines 162-163: The paper implies that TRex gains speed by avoiding Protocol 3, but in practice, idtracker.ai also typically avoids using Protocol 3 due to its extremely long runtime. This part of the framing feels more like a rhetorical contrast than an informative one.

      (2.3) Lines 277-280: The contrastive loss function is written using the label l, but since it refers to a pair of images, it would be clearer and more precise to write it as l_{I,J}. This would help readers unfamiliar with contrastive learning understand the formulation more easily.

      (2.4) Lines 333-334: The manuscript states that TRex can fail to track certain videos, but this may be inaccurate depending on how the authors classify failures. TRex may return low uniqueness scores if training does not converge well, but this isn't equivalent to tracking failure. Moreover, the metric reported by TRex is uniqueness, not accuracy. Equating the two could mislead readers. If the authors did compare outputs to human-validated data, that should be stated more explicitly.

      (2.5) Lines 339-341: The evaluation approach defines a "successful run" and then sums the runtime across all attempts up to that point. If success is defined as simply producing any output, this may not reflect how experienced users actually interact with the software, where parameters are iteratively refined to improve quality.

      (2.6) Lines 344-346: The simulation process involves sampling tracking parameters 10,000 times and selecting the first "successful" run. If parameter tuning is randomized rather than informed by expert knowledge, this could skew the results in favor of tools that require fewer or simpler adjustments. TRex relies on more tunable behavior, such as longer fragments improving training time, which this approach may not capture.

      (2.7) Line 354 onward: TRex was evaluated using two varying parameters (threshold and track_max_speed), while idtracker.ai used only one (intensity_threshold). With a fixed number of samples, this asymmetry could bias results against TRex. In addition, users typically set these parameters based on domain knowledge rather than random exploration.

      (2.8) Figure 2-figure supplement 3: The memory usage comparison lacks detail. It's unclear whether RAM or VRAM was measured, whether shared or compressed memory was included, or how memory was sampled. Since both tools dynamically adjust to system resources, the relevance of this comparison is questionable without more technical detail.
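      To make the evaluation procedure questioned in (2.5) and (2.6) concrete, the sketch below accumulates runtime over attempts until the first run flagged as successful, and averages that total over random orderings of the parameter settings. The per-run data, the function names, and the choice of sampling without replacement are all assumptions for illustration; the manuscript's exact procedure may differ.

```python
import random


def time_to_first_success(runs, order):
    """Sum runtimes in the given order, up to and including the first success.

    `runs` is a list of (runtime_seconds, succeeded) tuples.
    Returns None if no attempt in `order` succeeds.
    """
    total = 0.0
    for idx in order:
        runtime, succeeded = runs[idx]
        total += runtime
        if succeeded:
            return total
    return None


def expected_time(runs, n_sim, rng):
    """Average time to first success over random orderings of the attempts."""
    totals = []
    for _ in range(n_sim):
        order = list(range(len(runs)))
        rng.shuffle(order)  # assumption: parameters are tried in random order
        total = time_to_first_success(runs, order)
        if total is not None:
            totals.append(total)
    return sum(totals) / len(totals) if totals else None


# Hypothetical runs: (runtime in seconds, whether the run met the success criterion).
example_runs = [(10.0, False), (5.0, True), (7.0, False)]
print(expected_time(example_runs, n_sim=10_000, rng=random.Random(0)))
```

      Under this scheme, a tool whose success depends on more jointly tuned parameters has a lower chance that any random draw succeeds, so its expected time grows even if an informed user would reach a good setting quickly. That is the asymmetry raised in (2.6) and (2.7).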

      (3) While the authors cite several key papers on contrastive learning, they do not use the introduction or discussion to effectively situate their approach within related fields where similar strategies have been widely adopted. For example, contrastive embedding methods form the backbone of modern facial recognition and other image similarity systems, where the goal is to map images into a latent space that separates identities or classes through clustering. This connection would help emphasize the conceptual strength of the approach and align the work with well-established applications. Similarly, there is a growing literature on animal re-identification (ReID), which often involves learning identity-preserving representations across time or appearance changes. Referencing these bodies of work would help readers connect the proposed method with adjacent areas using similar ideas, and show that the authors are aware of and building on this wider context.

      (4) Some sections of the Results text (e.g., lines 48-74) read more like extended figure captions than part of the main narrative. They include detailed explanations of figure elements, sorting procedures, and video naming conventions that may be better placed in the actual figure captions or moved to supplementary notes. Streamlining this section in the main text would improve readability and help the central ideas stand out more clearly.

      Overall, though, this is a high-quality paper. The improvements to idtracker.ai are well justified and practically significant. Addressing the above comments will strengthen the work, particularly by clarifying the evaluation and comparisons.

    3. Reviewer #2 (Public review):

      This work introduces a new version of the state-of-the-art idtracker.ai software for tracking multiple unmarked animals. The authors aimed to solve a critical limitation of their previous software, which relied on the existence of "global fragments" (video segments where all animals are simultaneously visible) to train an identification classifier network, in addition to addressing concerns with runtime speed. To do this, the authors have both re-implemented the backend of their software in PyTorch (in addition to numerous other performance optimizations) and moved from a supervised classification framework to a self-supervised, contrastive representation learning approach that no longer requires global fragments to function. By defining positive training pairs as different images from the same fragment and negative pairs as images from any two co-existing fragments, the system cleverly takes advantage of partial (but high-confidence) tracklets to learn a powerful representation of animal identity without direct human supervision. Their formulation of contrastive learning is carefully thought out and comprises a series of empirically validated design choices that are both creative and technically sound. This methodological advance is significant and directly leads to the software's major strengths, including exceptional performance improvements in speed and accuracy and a newfound robustness to occlusion (even in severe cases where no global fragments can be detected). Benchmark comparisons show the new software is, on average, 44 times faster (up to 440 times faster on difficult videos) while also achieving higher accuracy across a range of species and group sizes. This new version of idtracker.ai is shown to consistently outperform the closely related TRex software (Walter & Couzin, 2021), which, together with the engineering innovations and usability enhancements (e.g., outputs convenient for downstream pose estimation), positions this tool as an advancement on the state-of-the-art for multi-animal tracking, especially for collective behavior studies.

      Despite these advances, we note a number of weaknesses and limitations that are not well addressed in the present version of this paper:

      (1) The contrastive representation learning formulation

      Contrastive representation learning using deep neural networks has long been used for problems in the multi-object tracking domain, popularized through ReID approaches like DML (Yi et al., 2014) and DeepReID (Li et al., 2014). More recently, contrastive learning has become more popular as an approach for scalable self-supervised representation learning for open-ended vision tasks, as exemplified by approaches like SimCLR (Chen et al., 2020), SimSiam (Chen et al., 2020), and MAE (He et al., 2021) and instantiated in foundation models for image embedding like DINOv2 (Oquab et al., 2023). Given their prevalence, it is useful to contrast the formulation of contrastive learning described here relative to these widely adopted approaches (and why this reviewer feels it is appropriate):

      (1.1) No rotations or other image augmentations are performed to generate positive examples. These are not necessary with this approach since the pairs are sampled from heuristically tracked fragments (which produces sufficient training data, though see weaknesses discussed below) and the crops are pre-aligned egocentrically (mitigating the need for rotational invariance).

      (1.2) There is no projection head in the architecture, like in SimCLR. Since classification/clustering is the only task that the system is intended to solve, the more general "nuisance" image features that this architectural detail normally affords are not necessary here.

      (1.3) There is no stop gradient operator like in BYOL (Grill et al., 2020) or SimSiam. Since the heuristic tracking implicitly produces plenty of negative pairs from the fragments, there is no need to prevent representational collapse due to class asymmetry. Some care is still needed, but the authors address this well through a pair sampling strategy (discussed below).

      (1.4) Euclidean distance is used as the distance metric in the loss rather than cosine similarity as in most contrastive learning works. While cosine similarity coupled with L2-normalized unit hypersphere embeddings has proven to be a successful recipe to deal with the curse of dimensionality (with the added benefit of bounded distance limits), the authors address this through a cleverly constructed loss function that essentially allows direct control over the intra- and inter-cluster distance (D_pos and D_neg). This is a clever formulation that aligns well with the use of K-means for the downstream assignment step.

      No concerns here, just clarifications for readers who dig into the review. Referencing the above literature would enhance the presentation of the paper to align with the broader computer vision literature.
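      For readers less familiar with margin-based contrastive losses, the following is a minimal sketch of one common double-margin form on Euclidean distance, in which positive pairs are penalized only when farther apart than D_pos and negative pairs only when closer than D_neg. The exact functional form and the default values of `d_pos` and `d_neg` are assumptions for illustration; the manuscript's loss may differ in detail.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pair_loss(emb_i, emb_j, same_identity, d_pos=1.0, d_neg=10.0):
    """Double-margin contrastive loss for one image pair (illustrative form).

    same_identity=True  -> images come from the same fragment (positive pair)
    same_identity=False -> images come from co-existing fragments (negative pair)
    d_pos and d_neg are hypothetical margin values, not the authors' settings.
    """
    d = euclidean(emb_i, emb_j)
    if same_identity:
        # Pull positives together until they are within d_pos of each other.
        return max(0.0, d - d_pos) ** 2
    # Push negatives apart until they are at least d_neg from each other.
    return max(0.0, d_neg - d) ** 2


# A positive pair already inside the margin contributes nothing to the loss.
print(pair_loss([0.0, 0.0], [0.5, 0.0], same_identity=True))
# A negative pair that sits too close is penalized.
print(pair_loss([0.0, 0.0], [3.0, 4.0], same_identity=False))
```

      Because pairs inside their margins contribute zero loss, a form like this sets target scales for intra- and inter-cluster distances directly, which fits the K-means assignment step noted in (1.4).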

      (2) Network architecture for image feature extraction backbone

      As most of the computations that drive up processing time happen in the network backbone, the authors explored a variety of architectures to assess speed, accuracy, and memory requirements. They land on ResNet18 due to its empirically determined performance. While the experiments that support this choice are solid, the rationale behind the architecture selection is somewhat weak. The authors state that:

      "[W]e tested 23 networks from 8 different families of state-of-the-art convolutional neural network architectures, selected for their compatibility with consumer-grade GPUs and ability to handle small input images (20 × 20 to 100 × 100 pixels) typical in collective animal behavior videos."

      (2.1) Most modern architectures have variants that are compatible with consumer-grade GPUs. This is true of, for example, HRNet (Wang et al., 2019), ViT (Dosovitskiy et al., 2020), SwinT (Liu et al., 2021), or ConvNeXt (Liu et al., 2022), all of which report single GPU training and fast runtime speeds through lightweight configuration or subsequent variants, e.g., MobileViT (Mehta et al., 2021). The authors may consider revising that statement or providing additional support for that claim (e.g., empirical experiments) given that these have been reported to outperform ResNet18 across tasks.

      (2.2) The compatibility of different architectures with small image sizes is configurable. Most convolutional architectures can be readily adapted to work with smaller image sizes, including 20x20 crops. With their default configuration, they lose feature map resolution through repeated pooling and downsampling steps, but this can be readily mitigated by swapping out standard convolutions with dilated convolutions and/or by setting the stride of pooling layers to 1, preserving feature map resolution across blocks. While these are fairly straightforward modifications (and are even compatible with using pretrained weights), an even more trivial approach is to pad and/or resize the crops to the default image size, which is likely to improve accuracy at a possibly minimal memory and runtime cost. These techniques may even improve the performance with the architectures that the authors did test out.

      (2.3) The authors do not report whether the architecture experiments were done with pretrained or randomly initialized weights.

      (2.4) The authors do not report some details about their ResNet18 design, specifically whether a global pooling layer is used and whether the output fully connected layer has any activation function. Additionally, they do not report the version of ResNet18 employed here, namely, whether the BatchNorm and ReLU are applied after (v1) or before (v2) the conv layers in the residual path.

      (3) Pair sampling strategy

      The authors devised a clever approach for sampling positive and negative pairs that is tailored to the nature of the formulation. First, since the positive and negative labels are derived from the co-existence of pretracked fragments, selection has to be done at the level of fragments rather than individual images. This would not be the case if one of the newer approaches for contrastive learning were employed, but it serves as a strength here (assuming that fragment generation/first pass heuristic tracking is achievable and reliable in the dataset). Second, a clever weighted sampling scheme assigns sampling weights to the fragments that are designed to balance "exploration and exploitation". They weigh samples both by fragment length and by the loss associated with that fragment to bias towards different and more difficult examples.

      (3.1) The formulation described here resembles and uses elements of online hard example mining (Shrivastava et al., 2016), hard negative sampling (Robinson et al., 2020), and curriculum learning more broadly. The authors may consider referencing this literature (particularly Robinson et al., 2020) for inspiration and to inform the interpretation of the current empirical results on positive/negative balancing.
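      The weighted fragment sampling described in (3) can be sketched as follows. Each fragment receives a weight that mixes its share of the total length with its share of the total loss, and fragments are then drawn in proportion to those weights. The linear mixing rule and the `mix` parameter are assumptions made for illustration; the review only states that both fragment length and per-fragment loss enter the weights.

```python
import random


def fragment_weights(lengths, losses, mix=0.5):
    """Sampling weight per fragment: a mix of normalized length and normalized loss.

    mix=1.0 samples purely by fragment length; mix=0.0 purely by current loss.
    The linear mix is a hypothetical choice, not the authors' documented rule.
    """
    total_len = sum(lengths)
    total_loss = sum(losses)
    n = len(lengths)
    weights = []
    for length, loss in zip(lengths, losses):
        by_length = length / total_len
        # Fall back to uniform if every fragment currently has zero loss.
        by_loss = loss / total_loss if total_loss > 0 else 1.0 / n
        weights.append(mix * by_length + (1 - mix) * by_loss)
    return weights


def sample_fragments(lengths, losses, k, rng, mix=0.5):
    """Draw k fragment indices (with replacement) according to the mixed weights."""
    weights = fragment_weights(lengths, losses, mix)
    return rng.choices(range(len(lengths)), weights=weights, k=k)


# Hypothetical fragments: lengths in frames and current mean loss per fragment.
lengths = [10, 30, 60]
losses = [0.9, 0.1, 0.0]
print(fragment_weights(lengths, losses))
print(sample_fragments(lengths, losses, k=5, rng=random.Random(0)))
```

      In this sketch a short fragment with a high loss can outweigh a long fragment that is already well learned, which is the hard-example bias that (3.1) relates to the literature.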

      (4) Speed and accuracy improvements

      The authors report considerable improvements in speed and accuracy of the new idTracker (v6) over the original idTracker (v4?) and TRex. It's a bit unclear, however, which of these are attributable to the engineering optimizations (v5?) versus the representation learning formulation.

      (4.1) Why is there an improvement in accuracy in idTracker v5 (L77-81)? This is described as a port to PyTorch and improvements largely related to the memory and data loading efficiency. This is particularly notable given that the progression went from 97.52% (v4; original) to 99.58% (v5; engineering enhancements) to 99.92% (v6; representation learning), i.e., most of the new improvement in accuracy owes to the "optimizations" which are not the central emphasis of the systematic evaluations reported in this paper.

      (4.2) What about the speed improvements? Relative to the original (v4), the authors report average speed-ups of 13.6x in v5 and 44x in v6. Presumably, the drastic speed-up in v6 comes from a lower Protocol 2 failure rate, but v6 is not evaluated in Figure 2 - figure supplement 2.

      (5) Robustness to occlusion

      A major innovation enabled by the contrastive representation learning approach is the ability to tolerate the absence of a global fragment (contiguous frames where all animals are visible) by requiring only co-existing pairs of fragments owing to the paired sampling formulation. While this removes a major limitation of the previous versions of idtracker.ai, its evaluation could be strengthened. The authors describe an ablation experiment where an arc of the arena is masked out to assess the accuracy under artificially difficult conditions. They find that v6 works robustly up to significant proportions of occlusion, even when doing so eliminates global fragments.

      (5.1) The experiment setup needs to be more carefully described.

      What does the masking procedure entail? Are the pixels masked out in the original video or are detections removed after segmentation and first pass tracking is done?

      What happens at the boundary of the mask? (Partial segmentation masks would throw off the centroids, and doing it after original segmentation does not realistically model the conditions of entering an occlusion area.)

      Are fragments still linked for animals that enter and then exit the mask area?

      How is the evaluation done? Is it computed with or without the masked region detections?

      (5.2) The circular masking is perhaps not the most appropriate for the mouse data, which is collected in a rectangular arena.

      (5.3) The number of co-existing fragments, which seems to be the main determinant of performance that the authors derive from this experiment, should be reported for these experiments. In particular, a "number of co-existing fragments" vs accuracy plot would support the use of the 0.25(N-1) heuristic and would be especially informative for users seeking to optimize experimental and cage design. Additionally, the number of co-existing fragments can be artificially reduced in ways other than a fixed occlusion, including random dropout, which would disambiguate it from potential allocentric positional confounds (particularly relevant in arenas where egocentric pose is correlated with allocentric position).

      (6) Robustness to imaging conditions

      The authors state that "the new idtracker.ai can work well with lower resolutions, blur and video compression, and with inhomogeneous light (Figure 2 - figure supplement 4)." (L156).

      Despite this claim, there are no speed or accuracy results reported for the artificially corrupted data, only examples of these image manipulations in the supplementary figure.

      (7) Robustness across longitudinal or multi-session experiments

      The authors reference idmatcher.ai as a compatible tool for this use case (matching identities across sessions or long-term monitoring across chunked videos), however, no performance data is presented to support its usage.

      This is relevant as the innovations described here may interact with this setting. While deep metric learning and contrastive learning for ReID were originally motivated by these types of problems (especially individuals leaving and entering the FOV), it is not clear that the current formulation is ideally suited for this use case. Namely, the design decisions described in point 1 of this review are at times at odds with the idea of learning generalizable representations owing to the feature extractor backbone (less scalable), low-dimensional embedding size (less representational capacity), and Euclidean distance metric without hypersphere embedding (possible sensitivity to drift).

      It's possible that data to support point 6 can mitigate these concerns through empirical results on variations in illumination, but a stronger experiment would be to artificially split up a longer video into shorter segments and evaluate how generalizable and stable the representations learned in one segment are across contiguous ("longitudinal") or discontiguous ("multi-session") segments.

    4. Reviewer #3 (Public review):

      Summary:

      The authors propose a new version of idTracker.ai for animal tracking. Specifically, they apply contrastive learning to embed cropped images of animals into a feature space where clusters correspond to individual animal identities.

      Strengths:

      By doing this, the new software alleviates the requirement for so-called global fragments - segments of the video, in which all entities are visible/detected at the same time - which was necessary in the previous version of the method. In general, the new method reduces the tracking time compared to the previous versions, while also increasing the average accuracy of assigning the identity labels.

      Weaknesses:

      The general impression of the paper is that, in its current form, it is difficult to disentangle the old from the new method and understand the method in detail. The manuscript would benefit from a major reorganization and rewriting of its parts. There are also certain concerns about the accuracy metric and reducing the computational time.

    5. Author response:

      We thank the editor and reviewers for their positive and detailed review of the preprint. We will use these comments to improve the manuscript's revised version, which we plan to submit in the coming weeks, including: a) tests of variants of ResNet, other network architectures and the use of pre-trained weights, b) clarification and justification of the accuracy metrics used in the benchmark, c) an expanded study about the fragment connectivity in Figure 3, and d) a study of the performance of idmatcher.ai with the new idtracker.ai.

    1. eLife Assessment

      This useful study presents interesting observations on the potential importance of extracellular transport of human papillomaviruses along actin protrusions by retrograde flow. The focus on the events of HPV infection between ECM binding and keratinocyte-specific receptor binding is unique and interesting. However, the evidence supporting the conclusions is incomplete, and additional experimental support is needed. Because conclusions drawn regarding HS interactions are largely based on experiments using a single HS mAb, the specificity of this mAb needs to be described in more detail, either based on the literature or further experimentation.

    2. Reviewer #1 (Public review):

      The authors' goal was to arrest PsV capsids on the extracellular matrix using cytochalasin D. The cohort was then released, and interaction with the cell surface, specifically with CD151, was assessed.

      The model that fragmented HS associated with released virions mediates the dominant mechanism of infectious entry has only been suggested by research from a single laboratory and has not been verified in the 10+ years since publication. The authors are basing this study on the assumption that this model is correct, and these data are referred to repeatedly as the accepted model despite much evidence to the contrary. The discussion in lines 65-71 concerning virion and HSPG affinity changes is greatly simplified. The structural changes in the capsid induced by HS interaction and the role of this priming for KLK8 and furin cleavage have been well researched. Multiple laboratories have independently documented this. If this study aims to verify the shedding model, additional data need to be provided. The model should be fitted into established entry events, or at minimum, these conflicting data, a subset of which is noted below, need to be acknowledged.

      (1) The Sapp lab (Richards et al., 2013) found that HSPG-mediated conformational changes in L1 and L2 allowed the release of the virus from primary binding and allowed secondary receptor engagements in the absence of HS shedding.

      (2) Becker et al. found that furin-precleaved capsids could infect cells independently of HSPG interaction, but this infection was still inhibited with cytochalasin D.

      (3) Other work from the Schelhaas lab showed that cytochalasin D inhibition of infection resulted in the accumulation of capsids in deep invaginations from the cell surface, not on the ECM.

      (4) Selinka et al., 2007, showed that preventing HSPG-induced conformational changes in the capsid surface resulted in noninfectious uptake that was not prevented with cytochalasin D.

      (5) The well-described capsid processing events by KLK8 and furin need to be mechanistically linked to the proposed model. Does inhibition of either of these cleavages prevent engagement with CD151?

      The authors need to consider an explanation for these discrepancies.

      Other issues:

      (1) Line 110-111. The statement about PsVs in the ECM being too far away from the cell surface to make physical contact with the cell surface entry receptors is confusing. ECM binding has not been shown to be an obligatory step for in vitro infection. This idea is referred to again on lines 158-159 and 199. The claim (line 158) that PsV does not interact with the cell within an hour needs to be demonstrated experimentally and seems at odds with multiple laboratories' data. PsV has been shown to directly interact with HSPG on the cell surface in addition to the ECM. Why are these PsVs not detected?

      (2) The experiments shown in Figure 5 need to be better controlled. Why is there no HS staining of the cell surface at the early timepoints? This antibody has been shown to recognize N-sulfated glucosamine residues on HS and, therefore, detects HSPG on the ECM and cell surface. Therefore, the conclusion that this confirms HS coating of PsV during release from the ECM (line 430-431) is unfounded. How do the authors distinguish between "HS-coated virions" and HSPG-associated virions?

      It is difficult to comprehend how the addition of 50 vge/cell of PsV could cause such a global change in HS levels. The claim that the HS levels are decreased in the non-cytochalasin-treated cells due to PsV-induced shedding needs to be demonstrated. If HS is actually shed, staining of the cell periphery could increase with the antibody 3G10, which detects the HS neoepitope created following heparinase cleavage.

    3. Reviewer #2 (Public review):

      Summary:

      Massenberg and colleagues aimed to understand how Human papillomavirus particles that bind to the extracellular matrix (ECM) transfer to the cell body for later uptake, entry, and infection. The binding to ECM is key for getting close to the virus's host cell (basal keratinocytes) after a wounding scenario for later infection in a mouse vaginal challenge model, indicating that this is an important question in the field.

      Strengths:

      The authors take on a conceptually interesting and potentially very important question to understand how initial infection occurs in vivo. The authors confirm previous work that actin-based processes contribute to virus transport to the cell body. The superresolution microscopy methods and data collection are state-of-the-art and provide an interesting new way of analysing the interaction with host cell proteins on the cell surface in certain infection scenarios. The proposed hypothesis is interesting and, if substantiated, could significantly advance the field.

      Weaknesses:

      As a study design, the authors use infection of HaCaT keratinocytes, and follow virus localisation with and without inhibition of actin polymerisation by cytochalasin D (cytoD) to analyse transfer of virions from the ECM to the cell by filopodial structures using important cellular proteins for cell entry as markers.

      First, the data is mostly descriptive besides the use of cytoD, and does not test the main claim of their model, in which virions that are still bound to heparan sulfate proteoglycans are transferred by binding to tetraspanins along filopodia to the cell body.

      Second, using cytoD is a rather broad treatment that not only affects actin retrograde flow, but also virus endocytosis and further vesicular transport in cells, including exocytosis. Inhibition of myosin II, e.g., by blebbistatin, would have been a better choice as it, for instance, does not interfere with endocytosis of the virus.

      Third, the authors aim to study transfer from ECM to the cell body and the effects thereof. However, there are substantial, if not the majority of, viruses that bind to the cell body compared to ECM-bound viruses in close vicinity to the cells. This is in part obscured by the small subcellular regions of interest that are imaged by STED microscopy, or by the use of plasma membrane sheets. As a consequence, the obtained data from time point experiments is skewed, and remains for the most part unconvincing due to the fact that the origin of virions in time and space cannot be taken into account. This is particularly important when interpreting association with HS, the tetraspanin CD151, and integrin alpha 6, as the low degree of association could originate from cell-bound and ECM-transferred virions alike.

      Fourth, the use of fixed images in a time course series also does not allow for understanding the issue of a potential contribution of cell membrane retraction upon cytoD treatment due to destabilisation of cortical actin. Or, of cell spreading upon cytoD washout. The microscopic analysis uses an extension of a plasma membrane stain as a marker for ECM-bound virions, which may introduce a bias and skew the analysis.

      Fifth, while the use of randomisation during image analysis is highly recommended to establish significance (flipping), it should be done using only ROIs that have a similar density of objects for which correlations are being established. For instance, if one flips an image with half of the image showing the cell body, and half of the image ECM, it is clear that association with cell membrane structures will only be significant in the original. I am rather convinced that using randomisation only on the plasma membrane ROIs will not establish any clear significance of the correlating signals. Also, there should be a higher n for the measurements.

    4. Author response:

      Reviewer #1 (Public review):

      The authors' goal was to arrest PsV capsids on the extracellular matrix using cytochalasin D. The cohort was then released, and interaction with the cell surface, specifically with CD151, was assessed.

      The model that fragmented HS associated with released virions mediates the dominant mechanism of infectious entry has only been suggested by research from a single laboratory and has not been verified in the 10+ years since publication. The authors are basing this study on the assumption that this model is correct, and these data are referred to repeatedly as the accepted model despite much evidence to the contrary.

      Please note that we state in the introduction on line 65/66 ´Two release mechanisms are discussed, that mutually are not exclusive´. This implies that we do not consider the shedding model to be the one accepted model. HS may associate with PsVs despite a decreased affinity and may translocate to the cell body only after priming (see the ‘priming model’ below).

      Furthermore, we do not state in the discussion either that the shedding model is the preferred one; although it is correct that we refer to the shedding model more extensively, simply because we find HS associated with transferred PsVs, which is in line with this model and requires its citation.

      The discussion in lines 65-71 concerning virion and HSPG affinity changes is greatly simplified. The structural changes in the capsid induced by HS interaction and the role of this priming for KLK8 and furin cleavage have been well researched. Multiple laboratories have independently documented this. If this study aims to verify the shedding model, additional data need to be provided.

      As outlined above, our finding is compatible with both models, and we do not aim to verify the shedding model or disprove the priming model.

      It appears that the referee wishes more visibility of the priming model. Inhibition of KLK8 and furin should reduce the translocation to the cell body, no matter whether PsVs carry HS on their surface or not. For revision, we plan an experiment as in Figure 3 (CytD), testing whether either KLK8 or furin inhibition blocks the transfer to the cell body. Our data can then also be discussed in the context of the priming model, thereby increasing its visibility.

      The model should be fitted into established entry events, or at minimum, these conflicting data, a subset of which is noted below, need to be acknowledged.

      (1) The Sapp lab (Richards et al., 2013) found that HSPG-mediated conformational changes in L1 and L2 allowed the release of the virus from primary binding and allowed secondary receptor engagements in the absence of HS shedding.

      (2) Becker et al. found that furin-precleaved capsids could infect cells independently of HSPG interaction, but this infection was still inhibited with cytochalasin D.

      (3) Other work from the Schelhaas lab showed that cytochalasin D inhibition of infection resulted in the accumulation of capsids in deep invaginations from the cell surface, not on the ECM

      (4) Selinka et al., 2007, showed that preventing HSPG-induced conformational changes in the capsid surface resulted in noninfectious uptake that was not prevented with cytochalasin D.

      (5) The well-described capsid processing events by KLK8 and furin need to be mechanistically linked to the proposed model. Does inhibition of either of these cleavages prevent engagement with CD151?

      The authors need to consider an explanation for these discrepancies.

      That PsVs carry HS-cleavage products does not imply that HS cleavage is sufficient or required for infection. Therefore, we do not view our data as being in conflict with the priming model. In fact, our observations are compatible with aspects of both the shedding and the priming model.

      Yet, we acknowledge that the study would gain importance by directly testing the priming model within our experimental system. As requested by the referee, we will discuss the above papers, and further plan to test KLK8 and furin inhibitors.

      Other issues:

      (1) Line 110-111. The statement about PsVs in the ECM being too far away from the cell surface to make physical contact with the cell surface entry receptors is confusing. ECM binding has not been shown to be an obligatory step for in vitro infection.

      Not obligatory, but strongly supportive (Bienkowska-Haba et al., Plos Path., 2018; Surviladze et al., J. Gen. Virol., 2015). As recently published by the Sapp lab (Bienkowska-Haba et al., Plos Path., 2018), ´Direct binding of HPV16 to primary keratinocytes yields very inefficient infection rates for unknown reasons.´ Moreover, the paper shows that HaCaT cell ECM binding of PsVs increases the infection of NHEK by 10-fold and of HFK by almost 50-fold.

      This idea is referred to again on lines 158-159 and 199. The claim (line 158) that PsV does not interact with the cell within an hour needs to be demonstrated experimentally and seems at odds with multiple laboratories' data. PsV has been shown to directly interact with HSPG on the cell surface in addition to the ECM. Why are these PsVs not detected?

      We do not question that in many cellular systems PsVs interact with heparan sulfate proteoglycans (HSPGs) present on the cell surface, or both on the cell surface and the ECM. We stated in the manuscript on line 59 ´While in cell culture virions bind to HS of the cell surface and the ECM, it has been suggested that in vivo they bind predominantly to HS of the extracellular basement membrane (Day and Schelhaas, 2014; Kines et al., 2009; Schiller et al., 2010).´

      Moreover, we ourselves detect these PsVs. For example, in Figure 5A (CytD, 0 min time point), a handful of PsVs localize to the cell body area. However, the large majority overlaps with the strong HS staining at the cell periphery, likely the ECM. An accurate quantification of the fractions of PsVs bound to the ECM and to the cell body is very difficult for the following reasons. First, the ECM PsVs are very dense and therefore not microscopically resolved into single PsVs, at least not completely (see Figure 1C; the high intensity spots are non-resolved PsVs, please see our discussion on line 148 - 152). For this reason, by just counting spots we strongly underestimate the ECM PsVs relative to the cell body PsVs. Second, with the available immunostainings we cannot exactly delineate the ECM from the cell body. In particular, at the cell border region (for example see Figure 4B) we often observe PsV accumulations. If these ´cell border region PsVs´ are assigned entirely to the cell body fraction, a preliminary analysis (correcting for the limitation of non-resolved ECM PsVs) suggests that about a quarter of the PsVs bind to the cell body. If, on the other hand, they are assigned to the ECM, the cell body fraction would be well below 10%. Third, we observe that in regions devoid of ECM and cells, PsVs apparently adhere nonspecifically to the glass coverslip. This suggests that some of the cell body PsVs are just nonspecific background. Subtracting a background PsV density from the ECM and cell body PsV densities will reduce the cell body PsVs relatively more, and consequently decreases the fraction of cell body PsVs even further.
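      The effect of background subtraction on the cell body fraction can be illustrated with a small numerical sketch. The densities below are hypothetical placeholders, not measurements from the study, and the function name `cell_body_fraction` as well as the assumption of equally sized ECM and cell body areas are introduced here only for illustration.

```python
def cell_body_fraction(ecm_density, cell_density, background=0.0):
    """Fraction of PsVs assigned to the cell body after background subtraction.

    Densities are PsVs per unit area; ECM and cell body regions are assumed
    to have equal area (an illustrative simplification).
    """
    ecm = max(ecm_density - background, 0.0)
    cell = max(cell_density - background, 0.0)
    total = ecm + cell
    return cell / total if total > 0 else 0.0


# Hypothetical densities: dense ECM binding, sparse cell body binding.
raw_fraction = cell_body_fraction(10.0, 1.0)
corrected_fraction = cell_body_fraction(10.0, 1.0, background=0.5)

print(f"without background subtraction: {raw_fraction:.3f}")
print(f"with background subtraction:    {corrected_fraction:.3f}")
```

      Because the same absolute background is removed from both regions, the sparsely covered cell body loses a larger share of its counts than the densely covered ECM, so the corrected cell body fraction is smaller than the raw one.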

      Moreover, in the course of the project we wondered whether the basolateral membrane has few binding sites to begin with. To address this question, in an unpublished experiment, we detached HaCaT cells with trypsin, incubated them with PsVs, and then allowed reattachment to assess the binding in suspension. We detected minimal to no binding, which, however, could also result from apical membrane adherence to the coverslip or trypsin-mediated cleavage of HSPGs. As suggested by the reviewing editor, we agree that repeating this experiment using EDTA for detachment, thus preserving HSPGs, would offer more definitive insight into binding efficiency in the absence of accessibility constraints. In summary, the reason why in our cellular system most PsVs do not bind to the cell surface could be a combination of several factors:

      (1) The primary binding partners are more abundant in the ECM and the polarized HaCaT cells secrete more ECM when compared to other cultured cells used to study HPV infection. This promotes ECM binding.

      (2) In the polarized HaCaT cells, the apical membrane is largely devoid of syndecan-1, CD151 and Itga6, which is why PsVs infect the cell via the basolateral membrane. However, the accessibility to the basolateral membrane is restricted: PsVs must diffuse through a narrow slit between the glass coverslip and the attached cell to reach HS on the cell surface. This limits cell surface binding.

      (3) If HaCaT cells secrete large amounts of ECM, they may become depleted of cell surface HS. As outlined above, we will try to find out how many PsVs bind to the basolateral membrane in the absence of restricted accessibility. If it turns out that HaCaT cells have few binding sites anyway, this would additionally promote binding to the ECM.

      The outcome of the above issues, and how we will mention them in the revised version of the manuscript, is open. In any case, we would like to point out that PsVs bound to the cell body do not weaken our main conclusion. Still, we recognize that this point merits attention and plan several modifications of the manuscript. We already mention, and will now state more explicitly, that PsVs have been shown to directly interact with HSPG on the cell surface, in addition to the ECM, but that it has also been shown that the ECM strongly supports infection in NHEK and HFK (Bienkowska-Haba et al., Plos Path., 2018). The following is a draft version of a paragraph we plan to incorporate, explaining the above issue and why we used HaCaT cells in our experiments:

      ´In vitro, PsVs bind to both the cell surface and the ECM, as has been widely documented. In vivo, however, it has been proposed that initial binding occurs predominantly to the basement membrane ECM, rather than directly to the cell surface (Day and Schelhaas, 2014; Kines et al., 2009; Schiller et al., 2010). This distinction reinforces the physiological relevance of ECM-bound particles in the early steps of HPV infection. Support for a functional role of ECM-mediated entry comes from a study showing that PsV binding to ECM derived from HaCaT cells significantly enhances infection of primary keratinocytes (Bienkowska-Haba et al., 2018). For these reasons, we specifically chose polarized HaCaT cells as a model system. These cells secrete abundant ECM from which the cells readily collect bound PsVs. On the other hand, the polarization limits the access of PsVs to basolateral receptors such as CD151 and Itgα6, and also cell body resident Syndecan-1, the most abundant HSPG in keratinocytes (Rapraeger et al., 1986; Hayashi et al., 1987; Kim et al., 1994). Hence, as polarization limits direct cell surface accessibility, it biases binding toward the ECM, which is abundant in this culture system. Thus, in the HaCaT cell culture system, as probably in vivo, PsVs cannot circumvent binding to the ECM, as they can in unpolarized cell cultures that may not even secrete significant amounts of ECM. Altogether, this experimental situation closely mimics the in vivo situation where PsVs bind preferentially to the ECM (Day and Schelhaas, 2014; Kines et al., 2009; Schiller et al., 2010).´

      We appreciate the reviewer’s input and believe these additions will strengthen the manuscript with regard to the relevance of the used cellular model system.

      (2) The experiments shown in Figure 5 need to be better controlled. Why is there no HS staining of the cell surface at the early timepoints? This antibody has been shown to recognize N-sulfated glucosamine residues on HS and, therefore, detects HSPG on the ECM and cell surface.

      We have shown all images at the same adjustments of brightness and contrast. As the staining at the periphery is stronger, the impression is given that the cell surface is not stained, although there is some staining. Specific staining is documented in Figure 5D, showing the PCC between PsVs and HS only of the cell body. If there was no HS staining, the PCC would be zero, which is not the case. Yet, it is lower when compared to the PCC measured at the cell border region, with more strongly stained HS.

      We will provide images at different contrast and brightness adjustments, enabling the reader to see the staining on the cell surface. We will also provide more overview images to illustrate the strong variability of the HS staining between cells.

      Therefore, the conclusion that this confirms HS coating of PsV during release from the ECM (line 430-431) is unfounded. How do the authors distinguish between "HS-coated virions" and HSPG-associated virions?

      The HS intensity transiently increases on the cell body (Fig. 5D) only after releasing a cohort of PsVs, which can only be explained by PsVs that carry HS from the ECM to the cell body. However, the effect is not significant. Using the antibody 3G10, which detects the HS neoepitope (see the referee's suggestion below), we will reanalyze this point. This should help clarify the issue.

      It is difficult to comprehend how the addition of 50 vge/cell of PsV could cause such a global change in HS levels.

      The distribution of bound PsVs varies widely between cells. Some areas are covered with essentially confluent cells, to which hardly any PsVs are bound, because accessing the basolateral membrane of confluent cells is nearly impossible, and PsVs do not bind to the exposed apical membrane. This is different in cultures of unpolarized cells, where we expect PsVs to distribute more evenly over cells.

      This means that in our experiments the vge/cell is not a suitable parameter for relating the magnitude of an effect to a defined number of PsVs. In the ECM, the PsV density is very high, enabling one cell to collect several hundred PsVs, much more than expected from the 50 vge/cell. We will point this out in the revised version.

      The claim that the HS levels are decreased in the non-cytochalasin-treated cells due to PsV-induced shedding needs to be demonstrated.

      We did not claim that PsVs induce shedding; rather, we believe they just take shed HS with them. Without PsVs, the shed HS likely remains in the ECM or is washed out very slowly.

      If HS is actually shed, staining of the cell periphery could increase with the antibody 3G10, which detects the HS neoepitope created following heparinase cleavage.

      As outlined above, we plan to test the suggested antibody 3G10. We also plan to repeat the 0 min time point (with and without PsVs, with and without CytD) to find out whether, in the absence of PsVs, the HS intensity (at 0 min) is unchanged between control and CytD.

      Reviewer #2 (Public review):

      Summary:

      Massenberg and colleagues aimed to understand how Human papillomavirus particles that bind to the extracellular matrix (ECM) transfer to the cell body for later uptake, entry, and infection. The binding to ECM is key for getting close to the virus's host cell (basal keratinocytes) after a wounding scenario for later infection in a mouse vaginal challenge model, indicating that this is an important question in the field.

      Strengths:

      The authors take on a conceptually interesting and potentially very important question to understand how initial infection occurs in vivo. The authors confirm previous work that actin-based processes contribute to virus transport to the cell body. The superresolution microscopy methods and data collection are state-of-the-art and provide an interesting new way of analysing the interaction with host cell proteins on the cell surface in certain infection scenarios. The proposed hypothesis is interesting and, if substantiated, could significantly advance the field.

      Weaknesses:

      As a study design, the authors use infection of HaCaT keratinocytes, and follow virus localisation with and without inhibition of actin polymerisation by cytochalasin D (cytoD) to analyse transfer of virions from the ECM to the cell by filopodial structures using important cellular proteins for cell entry as markers.

      First, the data is mostly descriptive besides the use of cytoD, and does not test the main claim of their model, in which virions that are still bound to heparan sulfate proteoglycans are transferred by binding to tetraspanins along filopodia to the cell body.

      The study identifies a rapid translocation step from the ECM to the cell body. We have no data that demonstrate a physical interaction between PsVs and CD151. In the model figure, we draw CD151 as part of the secondary receptor complex. We are sorry for having given the impression that PsVs would bind directly to CD151 and will rephrase the respective section.

      Second, using cytoD is a rather broad treatment that not only affects actin retrograde flow, but also virus endocytosis and further vesicular transport in cells, including exocytosis. Inhibition of myosin II, e.g., by blebbistatin, would have been a better choice as it, for instance, does not interfere with endocytosis of the virus.

      We agree, and plan to test whether blebbistatin is equally efficient in blocking the transfer.

      Third, the authors aim to study transfer from ECM to the cell body and the effects thereof. However, there are substantial, if not the majority of, viruses that bind to the cell body compared to ECM-bound viruses in close vicinity to the cells.

      We agree that in multiple cell culture systems viruses bind preferentially to the cell directly. But we respectfully disagree with the assertion that the majority of PsVs bind to the cell body of HaCaT keratinocytes. As noted above (e.g., Figure 5A, CytD, 0 min), only a small fraction of PsVs localize to the cell body, whereas the vast majority overlap with intense HS staining at the cell periphery, consistent with ECM association, as the accessibility to the basolaterally expressed HSPG is limited (see above). Based on quantitative estimation from multiple images, ECM-bound PsVs largely outnumber cell-bound particles (see above). These features make HaCaT cells a suitable in vitro model for mimicking in vivo conditions, where HPV has been proposed to bind predominantly to the basement membrane ECM rather than the cell surface (Day and Schelhaas, 2014; Kines et al., 2009; Schiller et al., 2010), which also strongly enhances infection of primary keratinocytes in vitro (Bienkowska-Haba et al., 2018).

      Thus, we believe our system appropriately models the physiologically relevant scenario of ECM-to-cell transfer, and the observed predominance of ECM binding supports the validity of our experimental focus.

      This is in part obscured by the small subcellular regions of interest that are imaged by STED microscopy, or by the use of plasma membrane sheets. As a consequence, the obtained data from time point experiments is skewed, and remains for the most part unconvincing due to the fact that the origin of virions in time and space cannot be taken into account. This is particularly important when interpreting association with HS, the tetraspanin CD151, and integrin alpha 6, as the low degree of association could originate from cell-bound and ECM-transferred virions alike.

      As stated above, we observe massive binding of PsVs to the ECM, in contrast to very few PsVs that diffuse beneath the basolateral membrane of the polarized HaCaT cells and do bind directly to the cell surface (or maybe they are simply trapped between glass and basolateral membrane). PsVs are not expected to bind to the apical membrane, which is depleted of CD151 and Itga6. In other cellular systems, cells may hardly secrete ECM, are not polarized, and do not adhere so tightly to the substrate. In such cultures, where virions can easily circumvent ECM binding, the large majority of PsVs will likely bind directly to the cell surface.

      As outlined above, in order to quantify PsVs that can bind without restricted accessibility, we plan to detach HaCaT cells by EDTA from the substrate, incubate them with PsVs, and let them adhere again (please see above).

      Whatever the outcome, the fraction of PsVs that binds directly to the cell surface does not weaken our conclusion that we have identified a very fast and efficient transfer step from the ECM to the cell body.

      Fourth, the use of fixed images in a time course series also does not allow for understanding the issue of a potential contribution of cell membrane retraction upon cytoD treatment due to destabilisation of cortical actin. Or, of cell spreading upon cytoD washout.

      If blebbistatin works as expected, we can safely conclude that we observe the very same process as described in Schelhaas et al., PLoS Pathogens, 2008, showing that the PsVs migrate by retrograde transport to the cell surface, and not that the cell spreads out and thereby reaches the PsVs.

      The microscopic analysis uses an extension of a plasma membrane stain as a marker for ECM-bound virions, which may introduce a bias and skew the analysis.

      Our plasma membrane stain does not stain the ECM. Please see Figure 1. The stain is actually used to distinguish the cell body from the ECM area.

      Fifth, while the use of randomisation during image analysis is highly recommended to establish significance (flipping), it should be done using only ROIs that have a similar density of objects for which correlations are being established.

      We agree that how randomization is done is very important. Regarding the association of PsVs with CD151 and HS, we generated a calibration curve based on flipped images and used it to correct for random background. For details, please see Supplementary Figures 3 and 5.

      For instance, if one flips an image with half of the image showing the cell body, and half of the image ECM, it is clear that association with cell membrane structures will only be significant in the original. I am rather convinced that using randomisation only on the plasma membrane ROIs will not establish any clear significance of the correlating signals.

      Figure 5D shows the PCC specifically of the cell body. In flipped images (not shown in the manuscript for clarity, but can be added) we obtain a PCC of around zero. For CytD, the flipped images always have a significantly lower PCC compared to the original images. In the control, the PCCs of the flipped images are significantly lower only for the 30 min and 60 min time points. The non-significance of the 0 min and 180 min time points is due to low PCCs also in the original images.
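      The flipped-image control referred to here can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis pipeline used in the manuscript: the function names `pearson_cc` and `flipped_control_pcc`, the choice of a horizontal flip, and the synthetic two-channel images are all hypothetical.

```python
import numpy as np


def pearson_cc(channel_a, channel_b):
    """Pearson correlation coefficient (PCC) between two equally sized images."""
    a = np.asarray(channel_a, dtype=float).ravel()
    b = np.asarray(channel_b, dtype=float).ravel()
    return float(np.corrcoef(a, b)[0, 1])


def flipped_control_pcc(channel_a, channel_b):
    """PCC after mirroring the second channel horizontally.

    Flipping destroys the spatial relationship between the channels while
    preserving the intensity distribution of each channel.
    """
    return pearson_cc(channel_a, np.flip(channel_b, axis=1))


# Synthetic two-channel image (hypothetical data): the second channel is the
# first channel plus noise, so the channels are correlated pixel by pixel.
rng = np.random.default_rng(0)
psv_channel = rng.random((128, 128))
hs_channel = psv_channel + 0.1 * rng.standard_normal((128, 128))

pcc_original = pearson_cc(psv_channel, hs_channel)
pcc_flipped = flipped_control_pcc(psv_channel, hs_channel)
print(f"original PCC: {pcc_original:.3f}, flipped PCC: {pcc_flipped:.3f}")
```

      In this synthetic case the original PCC is high and the flipped PCC is close to zero. For real images, both functions would be applied to regions cropped to the cell body only, so that the flipped control compares areas of similar object density.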

      Also, there should be a higher n for the measurements.

      One n is the average of 15 cells. We realize that with n = 3 we find significant effects only if the effect is very strong or moderate with very low variance.

    1. eLife Assessment

      This valuable study outlines the mechanism by which repeated vaccination broadens the breadth of antibody responses against epitope unmatched virus strains. The authors' mathematical model is solid and incorporates various parameters that regulate B cell activation and antibody response.

    2. Reviewer #1 (Public Review):

      In this study, Deng et al. investigate the antibody response against HA antigen following repeated vaccination with the H1N1 2009 pandemic influenza vaccine strain, using in silico modeling. The proposed model provides valuable mechanistic insights into how the broadening of the antibody response takes place upon repeated vaccination.

      Overall, the authors' model effectively explains the mechanistic principles underlying antibody responses against the viral antigens harboring epitope immunodominancy.

    3. Reviewer #2 (Public Review):

      The authors have been studying the mechanism of breadth expansion in antibody responses with repeated vaccinations using their own mathematical model. In this study, they applied this mathematical model to cohort data on anti-HA antibody responses after multiple influenza virus vaccinations and investigated the mechanism of antibody breadth expansion to diversified target viral strains.

      The manuscript is well written, and the mathematical model is well built, incorporating various parameters related to B cell activation in GC and EGC based on experimental data.

      Strengths:

      By carefully reanalyzing the published cohort data (Nunez IA et al 2017 PLoS One), they have clearly demonstrated that the repeated influenza virus vaccinations result in an expansion of the breadth to unmatched viral strains.

      Using their mathematical model, they have determined the major factors for the breadth expansion following multiple immunizations.

      Weaknesses:

      The overall concept of their model has already been published (Yang L et al 2023 Cell Reports) with a SARS-CoV-2 vaccine model, and they have applied it to influenza virus vaccine in this study, with the conclusions being largely the same.

      It is unclear how the re-evaluation of public data in the first half is related to the validation of their model in the later part.

      Other points:

      In the original data by Nunez IA et al., HAI (the inhibitory effect of anti-HA antibodies on the binding of HA to sialic acid on erythrocytes) was used as the lead-out. The authors conclude that the breadth expansion with repeated vaccinations is primarily due to the activation of B cells with BCRs that recognize minor common epitopes, induced by covering up of strain specific major epitopes by pre-existing antibodies. However, as they themselves show in Fig 1, once the sialic acid-binding region is covered, it seems difficult for another BCR to bind to this region. When the target epitope is limited like this, the effect of increasing antigen supply to DCs by pre-existing antibodies and the effect of increasing the presentation of minor epitopes appear to compete with each other. Could the authors please explain this point? In relation to this point, please explain the meaning of analysis of the entire ectodomain when the original data's lead-out is HAI.

      Minor point:

      The description "The purpose of this model is ...." starting at line 171 and the description of "we obtain results in harmony with the clinical findings ...." starting at line 478 sound to be contradictory. As the authors themselves state at line 171, if the purpose of this model is not to fit the data but to demonstrate the principle, then the prudent sampling and reanalyzing data itself seems to have less meaning.

    4. Author response:

      Reviewer #1 (Public Review):

      In this study, Deng et al. investigate the antibody response against HA antigen following repeated vaccination with the H1N1 2009 pandemic influenza vaccine strain, using in silico modeling. The proposed model provides valuable mechanistic insights into how the broadening of the antibody response takes place upon repeated vaccination.

      Overall, the authors' model effectively explains the mechanistic principles underlying antibody responses against the viral antigens harboring epitope immunodominancy.

      We thank the Reviewer for their positive and thoughtful assessment of the work. We address issues raised in the revised manuscript and in the point-by-point responses below.

      Reviewer #2 (Public Review):

      The authors have been studying the mechanism of breadth expansion in antibody responses with repeated vaccinations using their own mathematical model. In this study, they applied this mathematical model to a cohort data analyzing anti-HA antibody responses after multiple influenza virus vaccination and investigated the mechanism of antibody breadth expansion to diversified target viral strains.

      The manuscript is well written, and the mathematical model is well built that incorporates various parameters related to B cell activation in GC and EGC based on experimental data.

      We thank the reviewer for their positive and thoughtful review and address issues raised in a revised version of the manuscript and in the point-by-point below.

      Strengths:

      By carefully reanalyzing the published cohort data (Nunez IA et al 2017 PLoS One), they have clearly demonstrated that the repeated influenza virus vaccinations result in an expansion of the breadth to unmatched viral strains.

      Using their mathematical model, they have determined the major factors for the breadth expansion following multiple immunizations.

      We thank the reviewer for pointing out the strengths of our study.

      Weaknesses

      The overall concept of their model has already been published (Yang L et al 2023 Cell Reports) with a SARS-CoV-2 vaccine model, and they have applied it to influenza virus vaccine in this study, with the conclusions being largely the same.

      It is unclear how the re-evaluation of public data in the first half part is related to the validation of their model in the later part.

      The reviewer is correct in that we build directly on our model published previously to study related phenomena for SARS-CoV-2. However, a critical advance of the work was to now ask whether antibody broadening following repeated homologous antigen exposure is a general feature of human humoral immunity. As we point out in the introduction of our manuscript, repeated exposure to the same antigen has long been assumed to predominantly boost strain limited humoral immunity, necessitating rational design of vaccines that re-orient antibody responses to target otherwise immune-subdominant targets. Hence, antibody broadening in response to homologous SARS-CoV-2 antigen points to reconsideration of that basic premise in immunology; and if we are to now define this as a general feature of human antibody responses, then evaluation of the principle using a different vaccine protocol and antigen is necessitated. Accordingly, we took advantage of the influenza vaccine space where, within the immediate years following the 2009 H1N1 pandemic, the 2009 H1N1 strain was repeatedly applied as the seasonal vaccine strain. This HA was also novel (as it was from a pandemic virus pHA), meaning that traditional back-boosting to historical strains would be limited. We then re-evaluated the longitudinal HAI data of Nunez et al. to define whether a broadening to increasingly divergent vaccine-unmatched strains is observed upon repeated exposure to pHA. This was not done before and was enabled by incorporating our amino acid relatedness parameter and our structure-based definition of the RBS patch. To then query mechanistic origins of the broadening effect, we adapted and extended our previous computational model to: (1) better reflect HA epitope diversity and overlap within the RBS patch; and (2) better reflect the influenza immunization regimens that are used clinically. The differences between the modeling done in this paper and that in Yang et al. 2023 are described in the Methods section separately. Taken together, our analyses of data in Nunez et al. and our simulations strengthen the emerging view that repeated boosting with the same antigen enables the humoral immune system to diversify immune responses because of feedback regulation which leads to enhanced antigen on FDCs, persistent GCs, and epitope masking. This, in turn, enables the immune system to generalize to recognize and respond to unseen variant antigens that harbor mutations in the immunodominant epitopes. Our results point to a new and emerging paradigm regarding booster immunizations and fundamental features of the humoral immune system.

      Other points:

      In the original data by Nunez IA et al., HAI (the inhibitory effect of anti-HA antibodies on the binding of HA to sialic acid on erythrocytes) was used as the lead-out. The authors conclude that the breadth expansion with repeated vaccinations is primarily due to the activation of B cells with BCRs that recognize minor common epitopes, induced by covering up of strain specific major epitopes by pre-existing antibodies. However, as they themselves show in Fig 1, once the sialic acid-binding region is covered, it seems difficult for another BCR to bind to this region. When the target epitope is limited like this, the effect of increasing antigen supply to DCs by pre-existing antibodies and the effect of increasing the presentation of minor epitopes appear to compete with each other. Could the authors please explain this point?

      We agree that accounting for epitope overlap is important when the target is limited, as the reviewer indicates. In Figure 6C vs 6D we assess steric effects of possible spatial overlap between dominant and subdominant epitopes. Under overlapping conditions, we find evidence for steric-based constrainment of broadening, as predicted by the reviewer. Depending upon the degree of overlap between the epitopes and differences in germline characteristics in the B cells targeting dominant and subdominant epitopes, this effect could be compensated during subsequent shots, as described by our results (see lines 392-406).

      We also now incorporate the following sentence into our discussion (lines 448-453):

      “Epitope masking will also be constrained by the dimensions of the RBS and our simulations do report attenuation of titers against historical influenza strains when we introduce epitope overlap. Depending upon the degree of overlap between the epitopes and differences in germline characteristics in the B cells targeting dominant and subdominant epitopes, this effect could be compensated during subsequent shots.”

      In relation to this point, please explain the meaning of analysis of the entire ectodomain when the original data's lead-out is HAI.

      We include side-by-side full length ectodomain versus RBS patch (sialic acid binding residues + antibody epitope ring) to demonstrate relatedness differences in the lead-out data. But it is precisely because of the point raised by the reviewer that we focus on using the RBS patch as the relatedness values to assess antibody broadening as defined by HAI activity (see Figure 3 and S2). 

      Minor point:

      The description "The purpose of this model is ...." starting at line 171 and the description of "we obtain results in harmony with the clinical findings ...." starting at line 478 sound to be contradictory. As the authors themselves state at line 171, if the purpose of this model is not to fit the data but to demonstrate the principle, then the prudent sampling and reanalyzing data itself seems to have less meaning.

      We respectfully disagree. Please see the point above: our use of the clinical data goes beyond “reanalyzing”. We first discovered the previously unreported broadening effect across highly divergent strains following sequential immunization with homologous antigen in the influenza vaccine space; we then extended and adapted our computational model for the influenza vaccination paradigm to gain mechanistic insight on how such antibody broadening may occur. The word “harmony” was not meant to imply quantitative agreement, and we apologize if it caused confusion.

    1. eLife Assessment

      This important study by Wu et al presents convincing data on bacterial cell organization, demonstrating that the two structures that account for bacterial motility - the chemotaxis complex and the flagella - colocalize to the same pole in Pseudomonas aeruginosa cells, and expose the regulation underlying their spatial organization and functioning. This manuscript will be of interest to cell biologists, primarily those studying bacteria.

    2. Reviewer #1 (Public review):

      Summary:

      The study by Wu et al presents interesting data on bacterial cell organization, a field that is progressing now, mainly due to the advances in microscopy. Based mainly on fluorescence microscopy images, the authors aim to demonstrate that the two structures that account for bacterial motility, the chemotaxis complex and the flagella, colocalize to the same pole in Pseudomonas aeruginosa cells and to expose the regulation underlying their spatial organization and functioning.

      Comments on revisions:

      The authors have addressed all major and minor points that I raised in a satisfying way during the revision process. The work can now be regarded as complete: the assumptions were clarified, the results are convincing, the conclusions are justified, and the novelty has been made clear. This manuscript will be of interest to cell biologists, mainly those studying bacteria, but not only.

    3. Reviewer #2 (Public review):

      Summary:

      Here, the authors studied the molecular mechanisms by which the chemoreceptor cluster and flagella motor of Pseudomonas aeruginosa (PA) are spatially organized in the cell. They argue that FlhF is involved in localizing the receptors and motor to the cell pole, but a separate mechanism colocalizes them. Finally, the authors argue that the functional reason for this colocalization is to insulate chemotactic signaling from other signaling pathways, such as cyclic-di-GMP signaling.

      Strengths:

      The experiments and data are high quality. It is clear that the motor and receptors co-localize, and that elevated CheY levels lead to elevated c-di-GMP. The signaling crosstalk argument is plausible.

    4. Reviewer #3 (Public review):

      Summary:

      The authors investigated the assembly and polar localization of the chemosensory cluster in P. aeruginosa. They discovered that a certain protein (FlhF) is required for the polar localization of the chemosensory cluster while core motor structures are necessary for the assembly of the cluster. They found that flagella and chemosensory clusters always co-localize in the cell; either at the cell pole in wild type cells or randomly-located in the cell in FlhF mutant cells. They hypothesize that this co-localization is required to keep the level of another protein (CheY-P), which controls motor switching, at low levels as the presence of high-levels of this protein (if the flagella and chemosensory clusters were not co-localized) is associated with high-levels of c-di-GMP and cell aggregations.

      Strengths:

      The manuscript is clearly-written and straightforward. The authors applied multiple techniques to study the bacterial motility system including fluorescence light microscopy and gene editing. In general, the work enhances our understanding of the subtlety of interaction between the chemosensory cluster and the flagellar motor to regulate cell motility. This work will be of interest to bacteriologists and cell biologists in general.

    5. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public review):

      Summary:

      The study by Wu et al presents interesting data on bacterial cell organization, a field that is progressing now, mainly due to the advances in microscopy. Based mainly on fluorescence microscopy images, the authors aim to demonstrate that the two structures that account for bacterial motility, the chemotaxis complex and the flagella, colocalize to the same pole in Pseudomonas aeruginosa cells and to expose the regulation underlying their spatial organization and functioning.

      Comments on revisions:

      The authors have addressed all major and minor points that I raised in a satisfying way during the revision process. The work can now be regarded as complete, the assumptions were clarified, the results are convincing, the conclusions are justified, and the novelty has been made clear.

      This manuscript will be of interest to cell biologists, mainly those studying bacteria, but not only.

      Reviewer #2 (Public review):

      Summary:

      Here, the authors studied the molecular mechanisms by which the chemoreceptor cluster and flagella motor of Pseudomonas aeruginosa (PA) are spatially organized in the cell. They argue that FlhF is involved in localizing the receptors-motor to the cell pole, and even without FlhF, the two are colocalized. Finally, the authors argue that the functional reason for this colocalization is to insulate chemotactic signaling from other signaling pathways, such as cyclic-di-GMP signaling.

      Strength:

      The experiments and data are high quality. It is clear that the motor and receptors co-localize, and that elevated CheY levels lead to elevated c-di-GMP.

      Weakness:

      The explanation for the functional importance of receptor-motor colocalization is plausible but is still not conclusively demonstrated. Colocalization might reduce CheY levels throughout the cell in order to reduce cross-talk with c-di-GMP. This would mean that if physiologically-relevant levels of CheYp near the pole were present throughout the cell, c-di-GMP levels would be elevated to a point that is problematic for the cell. Clearly demonstrating this seems challenging.

      We acknowledge that directly proving the necessity of colocalization to prevent problematic c-di-GMP elevation is experimentally challenging, as it would require creating a system where CheY-P is artificially distributed throughout the cell at physiologically relevant concentrations while maintaining normal chemotaxis function.

      However, our data provide several lines of evidence supporting this model. First, we show that CheY overexpression leads to substantial c-di-GMP elevation (71.8% increase) and cell aggregation, demonstrating that elevated CheY levels can indeed cause problematic cross-pathway interference. Second, previous work has shown that CheY-P levels near the pole are an order of magnitude higher than in the rest of the cell (ref. 46). If this elevated CheY-P concentration near the pole were present throughout the cell, our data suggest that c-di-GMP levels would be elevated sufficiently to cause cell aggregation (Fig. 4A), thereby disabling normal motility and chemotaxis. Third, the dose-dependent relationship between CheY concentration and aggregation phenotype supports the idea that precise spatial regulation of CheY levels is functionally important for avoiding cross-pathway interference.

      Reviewer #3 (Public review):

      Summary:

      The authors investigated the assembly and polar localization of the chemosensory cluster in P. aeruginosa. They discovered that a certain protein (FlhF) is required for the polar localization of the chemosensory cluster while a fully-assembled motor is necessary for the assembly of the cluster. They found that flagella and chemosensory clusters always co-localize in the cell; either at the cell pole in wild type cells or randomly-located in the cell in FlhF mutant cells. They hypothesize that this co-localization is required to keep the level of another protein (CheY-P), which controls motor switching, at low levels as the presence of high-levels of this protein (if the flagella and chemosensory clusters were not co-localized) is associated with high-levels of c-di-GMP and cell aggregations.

      Strengths:

      The manuscript is clearly written and straightforward. The authors applied multiple techniques to study the bacterial motility system including fluorescence light microscopy and gene editing. In general, the work enhances our understanding of the subtlety of interaction between the chemosensory cluster and the flagellar motor to regulate cell motility.

      Weaknesses:

      The major weakness for me in this paper is that the authors never discussed how flagellar gene expression is controlled in P. aeruginosa. For example, in E. coli there is a transcriptional hierarchy for the flagellar genes (early, middle, and late genes, see Chilcott and Hughes, 2000). Similarly, Campylobacter and Helicobacter have a different regulatory cascade for their flagellar genes (see Lertsethtakarn, Ottemann, and Hendrixson, 2011). How does the expression of flagellar genes in P. aeruginosa compare to other species? How many classes are there for these genes? Is there a hierarchy in their expression, and how does this affect the results of the FliF and FliG mutants? In other words, if FliF and FliG are in class I (as in E. coli), then their absence might affect the expression of other later flagellar genes in subsequent classes (i.e., chemosensory genes). Also, in both FliF and FliG mutants no assembly intermediates of the flagellar motor are present in the cell, as FliG is required for the assembly of FliF (see Hiroyuki Terashima et al. 2020, Kaplan et al. 2019, Kaplan et al. 2022). It could be argued that when the motor is not assembled, this will affect the expression of the other genes (e.g., those of the chemosensory cluster), which might play a role in the decreased level of chemosensory clusters the authors find in these mutants.

      We thank the reviewer for the valuable suggestions. In the revised manuscript, we have further elaborated on the regulatory control of flagellar genes expression in P. aeruginosa (see our response to comment #4).

      Comments on revisions:

      I believe the authors have performed additional experiments that improved their manuscript and they have answered many of my comments and those of the other reviewers. I am supportive of publishing this manuscript, but I still find the following points that are not clear to me (probably I am misunderstanding some points; the authors can clarify).

      (1) In response to reviewer 1, the authors say that they "analyzed and categorized the distribution of the chemotaxis complex in both wild-type and flhF mutant strains into three patterns: precise-polar, near-polar, and mid-cell localization." I can see what they mean by polar and mid-cell, but near-polar sounds a bit elusive? Can they provide examples of this stage and mention how accurately they can identify it? Also, do the pie charts they show in Figure S4 really show "significant alterations"? There is a difference between 98% and 85% as they mention in their response to reviewer 1, but I am not sure that this is significant? Probably they can explain/change the language in the text? Also, the number of cells they counted for FlhF mutant is more than the double of other strains (WT and FlhF FliF mutant)?

      We thank the reviewer for the valuable suggestions. To clarify, we divided the intracellular area along the cell's long axis into three domains: the two ends each representing 10% of the length as the precise-polar domain, the central 50% as the mid-cell domain, and the remaining regions between these as the near-polar domain. The localization pattern of the chemotaxis complex was assigned based on the position of the fluorescence intensity centroid within these domains.
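
      The partitioning rule described above can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' analysis code: the function names, the use of micrometre units, and the handling of centroids that fall exactly on a domain boundary are all assumptions made for the example.

```python
from collections import Counter


def classify_localization(centroid_pos: float, cell_length: float) -> str:
    """Assign a fluorescence centroid to a domain along the cell's long axis.

    Hypothetical helper: centroid_pos is the distance of the centroid from
    one cell end, in the same units as cell_length.
    """
    if cell_length <= 0:
        raise ValueError("cell_length must be positive")
    frac = centroid_pos / cell_length
    if not 0.0 <= frac <= 1.0:
        raise ValueError("centroid lies outside the cell")

    # Distance to the nearest pole, as a fraction of cell length (0 to 0.5).
    dist_to_pole = min(frac, 1.0 - frac)

    if dist_to_pole <= 0.10:   # outer 10% at either end
        return "precise-polar"
    if dist_to_pole >= 0.25:   # central 50% of the cell
        return "mid-cell"
    return "near-polar"        # everything in between


def tally_patterns(cells):
    """Count localization patterns for (centroid_pos, cell_length) pairs."""
    return Counter(classify_localization(pos, length) for pos, length in cells)
```

      For instance, `tally_patterns([(0.1, 2.0), (1.0, 2.0), (0.35, 2.0)])` would count one cell in each of the three categories; the fractions reported for each strain follow by dividing each count by the number of cells analyzed.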

      Regarding the significance of the changes, you are correct to question our language. When flhF was knocked out, the proportion of chemotaxis complexes with precise-polar distribution decreased from 98% to 85%, a reduction of 13 percentage points. While this represents a measurable shift in localization pattern, describing this as "significant alterations" was probably imprecise. We have revised this language to more accurately reflect the magnitude of the change (lines 169-177).

      For the cell counting, we increased the sample size for the flhF mutant because this strain exhibited the appearance of mid-cell localization (approximately 5% of cells), which was not observed in wild-type or flhF fliF double mutant strains. To accurately quantify this rare phenotype and ensure statistical reliability, we analyzed more cells for this particular strain. This explains why the flhF mutant dataset contains approximately double the number of cells compared to the other strains.

      We have redrawn Figure S4 to include a clear schematic diagram of the cell partitioning method and provided representative examples of each localization pattern (precise-polar, near-polar, and mid-cell) to better illustrate how we distinguished between these categories.

      (2) One thing that also confused me is the following: One point that the authors stress is that FlhF localizes both the flagellum and the chemoreceptors to the pole. However, if I look at Figure 2B, the flagellum and the chemoreceptors still co-localize together (although not at the pole). If FlhF was responsible for co-localizing both of them to the pole, then wouldn't one expect them to be randomly localized in this mutant and by that I mean that they do not co-localize but that each of them (the flagellum and the chemoreceptors) are located in a different random location of the cell (not co-localized). The fact that they are still co-localized together in this mutant could also be interpreted by, for example, that FlhF localizes the flagellum to the pole and another mechanism localizes the chemoreceptors to the flagellum, hence, they still co-localize in this mutant because the chemoreceptors follow the flagellum by another mechanism to wherever it goes?

      Thank you for this insightful observation. You are correct that our current experimental results do not definitively establish that FlhF directly localizes both the flagellum and chemoreceptors to the pole independently. The persistent colocalization of flagella and chemoreceptors in the ΔflhF mutant, even when both are mislocalized away from the pole, actually suggests a more complex regulatory mechanism than we initially proposed.

      This observation highlights an important distinction between polar targeting and colocalization maintenance. Our data suggest that FlhF influences the polar targeting of the flagellum-chemoreceptor assembly, but the colocalization itself appears to be governed by a different mechanism that operates independently of FlhF. This could involve direct protein-protein interactions between flagellar and chemotaxis components, or shared assembly machinery that we have yet to identify.

      To better reflect this interpretation, we have revised the subsection title (line 150). We have also modified the relevant discussion (line 180) to more accurately describe FlhF’s role in polar targeting rather than claiming it directly controls chemoreceptor localization.

      (3) In the response to reviewers, the authors mention "suggesting that the assembly of the receptor complex is likely influenced mainly by the C-ring and MS-ring structures rather than by the P ring". However, in the article, they still write "The complete assembly of the motor serves as a partial prerequisite for the assembly of the chemotaxis complex, and its assembly site is also regulated by the polar anchor protein FlhF" despite their FlgI results which is not in accordance with this statement? Also, As I mentioned in my previous report, in FliG and FliF mutant the motor does not assemble (see Hiroyuki Terashima et al. 2020., and Kaplan et al., 2022).

      We thank the reviewer for the suggestions and acknowledge the contradictions in our original text. You are correct that in ΔfliF and ΔfliG mutants, the flagellar motor does not assemble, while the P ring (FlgI) functions as a bushing for the peptidoglycan layer and its absence does not prevent motor assembly.

      Our ΔflgI results, which showed normal chemotaxis complex assembly similar to wild-type, clearly demonstrate that the P ring is not required for chemoreceptor complex formation. This contradicts our original statement that "complete assembly of the motor serves as a partial prerequisite for the assembly of the chemotaxis complex."

      We have corrected this inconsistency by: 1) Revising the subsection title (line 186) to more accurately reflect that core motor structures, rather than complete motor assembly, influence chemoreceptor complex formation. 2) Modifying sentences in the introduction (lines 97-98) to better align with our experimental findings.

      (4) The authors have said in their response to my point "and currently, there is no evidence that FliA activity is influenced by proteins like FliG". I just want to clarify what I meant in my previous report: In E. coli, FliA binds to FlgM, and when the hook is assembled FlgM is secreted outside the cell allowing FliA to trigger the transcription of class III genes, which include the chemosensory genes (see Figure 5 in Beeby et al, 2020 in FEMS Microbiology, and Chilcott and Hughes, 2000). This implies that if the hook is not built, then late genes (including the chemoreceptors) should not be present. However, in Kaplan et al., 2019, the authors imaged a FliF mutant in Shewanella oneidensis (Figure S3) and still saw that chemoreceptors are present (I believe the authors must highlight this). This suggests that species such as Shewanella and Pseudomonas have a different assembly process than that E. coli, and although the authors say that in the text, I believe they still can refine this part more in the spirit of what I wrote here.

      We thank the reviewer for the important clarification regarding the differences in transcriptional regulation among bacterial species. We agree that the observation of chemoreceptors in Shewanella oneidensis ΔfliF mutants (Kaplan et al., 2019) represents a significant deviation from the well-characterized E. coli model and merits stronger emphasis. In response, we have expanded the discussion to more clearly highlight the critical distinctions in the transcriptional regulatory circuits governing flagellar and chemoreceptor biogenesis between E. coli and species such as Shewanella oneidensis and Pseudomonas aeruginosa (lines 351-363).

      I do not like to ask for additional experiments in the second round of review, so for me if the authors modify the text to tackle these points and allow for probable alternative explanations/ highlight gaps/ modify language used for some claims, then that is fine with me.

      Reviewer #2 (Recommendations for the authors):

      It is plausible that colocalization reduces CheY levels throughout the cell in order to reduce cross-talk with c-di-GMP. This would mean that if physiologically-relevant levels of CheYp near the pole were present throughout the cell, c-di-GMP levels would be elevated to a point that is problematic for the cell. Clearly demonstrating this seems challenging.

      We acknowledge that directly proving the necessity of colocalization to prevent problematic c-di-GMP elevation is experimentally challenging, as it would require creating a system where CheY-P is artificially distributed throughout the cell at physiologically relevant concentrations while maintaining normal chemotaxis function.

      However, our data provide several lines of evidence supporting this model. First, we show that CheY overexpression leads to substantial c-di-GMP elevation (71.8% increase) and cell aggregation, demonstrating that elevated CheY levels can indeed cause problematic cross-pathway interference. Second, previous work has shown that CheY-P levels near the pole are an order of magnitude higher than in the rest of the cell (ref. 46). If this elevated CheY-P concentration near the pole were present throughout the cell, our data suggest that c-di-GMP levels would be elevated sufficiently to cause cell aggregation (Fig. 4A), thereby disabling normal motility and chemotaxis. Third, the dose-dependent relationship between CheY concentration and aggregation phenotype supports the idea that precise spatial regulation of CheY levels is functionally important for avoiding cross-pathway interference.

    1. eLife Assessment

      This important computational study investigates homeostatic plasticity mechanisms that neurons may employ to achieve and maintain stable target activity patterns. The work extends previous analyses of calcium-dependent homeostatic mechanisms based on ion channel density by considering activity-dependent shifts in channel activation and inactivation properties that operate on faster and potentially variable timescales. The model simulations provide solid evidence for the potential functional importance of these mechanisms.

    2. Reviewer #1 (Public review):

      This revision of the computational study by Mondal et al addresses several issues that I raised in the previous round of reviews and, as such, is greatly improved. The manuscript is more readable, its findings are more clearly described, and both the introduction and the discussion section are tighter and more to the point. And thank you for addressing the three timescales of half activation/inactivation parameters. It makes the mechanism clearer.

      Some issues remain that I bring up below.

      Comment:

      I still have a bone to pick with the claim that "activity-dependent changes in channel voltage-dependence alone are insufficient to attain bursting". As I mentioned in my previous comment, this is also the case for the gmax values (channel density). If you choose the gmax's to be in a reasonable range, then the statement above simply cannot be true. And if, in contrast, you choose the activation/inactivation parameters to be unreasonable, then no set of gmax's can produce proper activity. So I remain baffled as to what exactly the point is that the authors are trying to make.

    3. Reviewer #2 (Public review):

      Summary:

      In this study, Mondal and co-authors present the development of a computational model of homeostatic plasticity incorporating activity-dependent regulation of gating properties (activation, inactivation) of ion channels. The authors show that, similar to what has been observed for activity-dependent regulation of ion channel conductances, implementing activity-dependent regulation of voltage sensitivity participates in the achievement of a target phenotype (bursting or spiking). The results however suggest that activity-dependent regulation of voltage sensitivity is not sufficient to allow this and needs to be associated with the regulation of ion channel conductances in order to reliably reach target phenotype. Although the implementation of this biologically relevant phenomenon is undeniably relevant, a few important questions are left unanswered.

      Strengths:

      (1) Implementing activity-dependent regulation of gating properties of ion channels is biologically relevant.

      (2) The modeling work appears to be well performed and provides results that are consistent with previous work performed by the same group.

      Weaknesses:

      (1) The main question not addressed in the paper is the relative efficiency and/or contribution of voltage-dependence regulation compared to channel conductance regulation in achieving the expected pattern of activity. Does voltage-dependence regulation contribute 50% or 10%? Although this is a difficult question to answer (and it might even be difficult to provide a number), it is important to determine whether channel conductance regulation remains the main parameter allowing the achievement of a precise pattern of activity (or its recovery after perturbation).

      (2) Another related question is whether the speed of recovery is significantly modified by implementing voltage-dependence regulation (it seems to be the case looking at Figure 3). More generally, I believe it would be important to give insights into the overall benefit of implementing voltage-dependence regulation, beyond its rather obvious biological relevance.

      (3) Along the same line, the conclusion about how voltage-dependence regulation and channel conductance regulation interact to provide the neuron with the expected activity pattern (summarized and illustrated in Figure 6) is rather qualitative. Consistent with my previous comments, one would expect some quantitative answers to this question, rather than an illustration that approximately places a solution in parameter space.

    4. Reviewer #3 (Public review):

      Mondal et al. use computational modeling to investigate how activity-dependent shifts in voltage-dependent (in)activation curves can complement changes in ion channel conductance to support homeostatic plasticity. While it is well established that the voltage-dependent properties of ion channels influence neuronal excitability, their potential role in homeostatic regulation, alongside conductance changes, has remained largely unexplored. The results presented here demonstrate that activity-dependent regulation of voltage dependence can interact with conductance plasticity to enable neurons to attain and maintain target activity patterns, in this case, intrinsic bursting. Notably, the timescale of these voltage-dependent shifts influences the final steady-state configuration of the model, shaping both channel parameters and activity features such as burst period and duration. A major conclusion of the study is that altering this timescale can seamlessly modulate a neuron's intrinsic properties, which the authors suggest may be a mechanism for adaptation to perturbations.

      While this conclusion is largely well-supported, additional analyses could help clarify its scope. For instance, the effects of timescale alterations are clearly demonstrated when the model transitions from an initial state that does not meet the target activity pattern to a new stable state. However, Fig. 6 and the accompanying discussion appear to suggest that changing the timescale alone is sufficient to shift neuronal activity more generally. It would be helpful to clarify that this effect primarily applies during periods of adaptation, such as neurodevelopment or in response to perturbations, and not necessarily once the system has reached a stable, steady state. As currently presented, the simulations do not test whether modifying the timescale can influence activity after the model has stabilized. In such conditions, changes in timescale are unlikely to affect network dynamics unless they somehow alter the stability of the solution, which is not shown here. That said, it seems plausible that real neurons experience ongoing small perturbations which, in conjunction with changes in timescale, could allow gradual shifts toward new solutions. This possibility is not discussed but could be a fruitful direction for future work.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Major comments:

      (1) The main issue that I have with this study is the lack of exploration of "why" the model produces the results it does. Considering this is a model, it should be possible to find out why the three timescales of half-act/inact parameter modifications lead to different sets of results. Without this, it is simply an exploratory exercise. (The model does this, but we do not know the mechanism.) Perhaps this is enough as an interesting finding, but it remains unconvincing and (clearly) does not have the impact of describing a potential mechanism that could be potentially explored experimentally.

      This is now addressed in a new section in Results (“Potential Mechanism”):

      “To explore why the properties of the resulting bursters depend on the timescale of half-(in)activation adjustments, we examined what happens when SP1 is assembled under different half-(in)activation timescales: (1) fast, (2) intermediate (matching the timescale of ion channel density changes), and (3) infinitely slow (i.e., effectively turned off). The effects of these timescales can be seen by comparing the zoomed-in views of the SP1 activity profiles under each condition (Figure 4).

      When half-(in)activations are fast, the time evolution of α, which tracks how far the activity pattern is from its targets (see Methods), shows an abrupt jump as it searches for a voltage-dependence configuration that meets calcium targets (Figure 4A). As this happens, the channel densities are slightly altered, and the process repeats. Slowing the half-(in)activation alterations reduces these abrupt fluctuations (Figure 4B). Making the alterations infinitely slow effectively removes half-(in)activation changes altogether, leaving the system reliant solely on slower alterations in maximal conductances (Figure 4C). Because each timescale of half-(in)activation produces a different channel repertoire at each time step, different timescales of half-(in)activation alteration led the model along different paths through the space of activity profiles and intrinsic properties. Ultimately, this resulted in distinct final activity patterns, all of which were consistent with the Ca²⁺ targets.”
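      The path-dependence described in the quoted passage can be illustrated with a deliberately minimal toy, which is not the authors' model. It assumes two regulated parameters (a hypothetical `g` standing in for a maximal conductance and a hypothetical `half` standing in for a half-(in)activation shift) that are both driven by the same error signal toward a single target, with "activity" taken to be their sum. Because one target cannot pin down two parameters, the end point depends on the relative time constants, even though the target is met in every case.

```python
import math

def relax(tau_g, tau_half, target=1.0, dt=0.01, steps=20_000):
    """Euler-integrate two parameters driven by one shared error signal.

    Toy stand-in only: `g` plays the role of a maximal conductance and
    `half` the role of a half-(in)activation shift. The "activity" is
    simply g + half, which is NOT the authors' neuron model.
    """
    g, half = 0.0, 0.0
    for _ in range(steps):
        error = target - (g + half)    # distance from the single target
        g += dt * error / tau_g        # slow regulation
        half += dt * error / tau_half  # adds nothing when tau_half is inf
    return g, half

# Three hypothetical regimes mirroring "fast", "matched", and "off".
fast = relax(tau_g=10.0, tau_half=1.0)
matched = relax(tau_g=10.0, tau_half=10.0)
off = relax(tau_g=10.0, tau_half=math.inf)

for name, (g, half) in [("fast", fast), ("matched", matched), ("off", off)]:
    print(f"{name:8s} g={g:.3f} half={half:.3f} activity={g + half:.3f}")
```

      In this sketch all three runs satisfy the target, yet they settle at different (g, half) pairs. It says nothing about the abrupt jumps in Figure 4, which would require the full conductance-based model.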

      (2) A related issue is the use of bootstrapping to do statistics for a family of models, especially when the question is in fact the width of the distribution of output attributes. I don't buy this. One can run enough models to find say N number of models within a tight range (say 2% cycle period) and the same N number within a loose range (say 20%) and compare the statistics within the two groups with the same N.

      We appreciate the reviewer’s skepticism regarding our statistical approach with the “Group of 5” and “Group of 20.” These groups arose from historical aspects of our analysis and this analysis does not directly advance the main point—that changes in the timescale of channel voltage-dependence alterations impact the properties of bursters to which the homeostatic mechanism converges. Therefore, we removed the references to the Group of 5 and focus on how the Group of 20 responds to variations in the timescale of voltage-dependent alterations.

      (3) The third issue is that many of the results that are presented (but not the main one) are completely expected. If one starts with gmax values that would never work (say all of them 0), then it doesn't matter how much one moves the act/inact curves one probably won't get the desired activity. Alternately, if one starts with gmax values that are known to work and randomizes the act/inact midpoints, then the expectation would be that it converges to something that works. This is Figure 1 B and C, no surprise. But it should work the other way around too. If one starts with random act/inact curves that would never work and fixes those, then why would one expect any set of gmax values would produce the desired response? I can easily imagine setting the half-act/inact values to values that never produce any activity with any gmax.

      We appreciate this observation and agree that it highlights a limitation of our initial condition sampling. Our claim that the half-(in)activation mechanism is subordinate to the maximal conductance mechanism is not intended as a general statement. Rather, we make this observation only within the specific range of initial conditions we explored. Within this restricted set, we found that the conductance mechanism was sufficient for successful assembly, while the half-(in)activation mechanism alone was not. We have revised the manuscript to limit the claim.

      “The results shown in Figure 1A require activity-dependent regulation of the maximal conductances. When activity-dependent regulation of the maximal conductances was turned off, the model failed to assemble SP1 into a burster (Figure 1B). This was seen in the other 19 Starting Parameters (SP2-SP20), as well.”

      (4) A potential response to my previous criticism would be that you put reasonable constraints on gmax's or half-act/inact values or tie the half-act to half-inact. But that is simply arbitrary ad hoc decisions made to make the model work, much like the L8-norm used to amplify some errors. There is absolutely no reason to believe this is tied to the biology of the system.

      Here the reviewer highlights that model choices (e.g., constraints on maximal conductance and half-(in)activation, use of the L8 norm) are not necessarily justified by biology. A discussion of the constraints on maximal conductance and half-(in)activation is in the Model Assumptions section at the end of Methods. The Methods also contain a longer discussion of the use of the L8 norm:

      “To compute this match score, we adapted a formulation from Alonso et al (2023), who originally used a root-mean-square (RMS) or L2 norm to combine the sensor mismatches. In that approach, each of the three sensor errors is divided by its allowable tolerance to produce a normalized error. These normalized errors are then squared, summed, and square-rooted to produce a single scalar score that reflects how well the model matches the target activity pattern.

      In our version, we instead used an L8 norm, which raises each normalized error to the 8th power before summing and taking the 1/8th root. This formulation emphasizes large deviations in any one sensor, making it easier to pinpoint which feature of the activity is limiting convergence. By amplifying outlier mismatches, this approach provided a clearer view of which sensor was driving model mismatch, helping us both interpret failure modes and tune the model’s sensitivity by adjusting the tolerances for individual sensor errors.

      Although the L8 norm emphasizes large deviations more strongly than the L2 norm, the choice of norm does not fundamentally alter which models can converge: a model that performs well under one norm can also be made to perform well under another by adjusting the allowable tolerances. The biophysical mechanisms by which neurons detect deviations from target activity and convert them into changes in ion channel properties are still not well understood. Given this uncertainty, and the fact that using different norms ultimately shouldn’t affect the convergence of a given model, the use of different norms to combine sensor errors is consistent with the broader basic premise of the model: that intrinsic homeostatic regulation is calcium mediated.”
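      The difference between the two norms can be made concrete with a short sketch. The function name `combined_score` and the example error and tolerance values are hypothetical and chosen only for illustration; the formula follows the verbal description above (normalize each error by its tolerance, raise to the power p, sum, take the 1/p root).

```python
def combined_score(errors, tolerances, p):
    """Combine sensor errors into one scalar using an L-p norm.

    Each error is first divided by its tolerance, as in the quoted
    description. p=2 gives the squared/summed/square-rooted score and
    p=8 gives the L8 variant.
    """
    normalized = [abs(e) / t for e, t in zip(errors, tolerances)]
    return sum(x ** p for x in normalized) ** (1.0 / p)

# Hypothetical values: two sensors within tolerance, one clearly outside it.
errors = [0.5, 0.5, 2.0]
tolerances = [1.0, 1.0, 1.0]

l2_score = combined_score(errors, tolerances, p=2)
l8_score = combined_score(errors, tolerances, p=8)
worst = max(abs(e) / t for e, t in zip(errors, tolerances))

print(f"L2 score: {l2_score:.4f}")
print(f"L8 score: {l8_score:.4f}")
print(f"largest normalized error: {worst:.4f}")
```

      With these inputs the L8 score sits almost exactly on the largest normalized error, whereas the L2 score is also raised by the two smaller errors. That is the sense in which the higher norm isolates the single sensor that limits convergence.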

      (5) The discussion of this manuscript is at once too long and not adequate. It goes into excruciating detail about things that are simply not explored in this study, such as phosphorylation mechanisms, justification of model assumptions of how these alterations occur, or even the biological relevance. (The whole model is an oversimplification - lack of anatomical structure, three calcium sensors, arbitrary assumptions, and how parameter bounds are implemented.) Lengthy justifications for why channel density & half-act/inact of all currents are obeying the same time constant are answering a question that no one asked. It is a simplified model to make an important point. The authors should make these parts concise and to the point. More importantly, the authors should discuss the mechanism through which these differences may arise. Even if it is not clear, they should speculate.

      We agree. The long discussion of Model Assumptions and of potential biological mechanisms that implement alterations in channel voltage-dependence obscured the main point. The former is relocated to the Methods section. The latter discussion is shortened. A discussion of a potential mechanism is included in the Results (Figure 4).

      (6) There should be some justification or discussion of the arbitrary assumptions made in the model/methods. I understand some of this is to resolve issues that had come up in previous iterations of this approach and in fact the Alonso et al, 2023 paper was mainly to deal with these issues. However, some level of explanation is needed, especially when assumptions are made simply because of the intuition of the modeler rather than the existence of a biological constraint or any other objective measure.

      A discussion of Model Assumptions is included in the Methods.

      Reviewer #2 (Public review):

      Summary:

      In this study, Mondal and co-authors present the development of a computational model of homeostatic plasticity incorporating activity-dependent regulation of gating properties (activation, inactivation) of ion channels. The authors show that, similar to what has been observed for activity-dependent regulation of ion channel conductances, implementing activity-dependent regulation of voltage sensitivity participates in the achievement of a target phenotype (bursting or spiking). The results however suggest that activity-dependent regulation of voltage sensitivity is not sufficient to allow this and needs to be associated with the regulation of ion channel conductances in order to reliably reach the target phenotype. Although the implementation of this biologically relevant phenomenon is undeniably relevant, the main conclusions of the paper and the insights brought by this computational work are difficult to grasp.

      Strengths:

      (1) Implementing activity-dependent regulation of gating properties of ion channels is biologically relevant.

      (2) The modeling work appears to be well performed and provides results that are consistent with previous work performed by the same group.

      Weaknesses:

      (1) The writing is rather confusing, and the state of the art explaining the need for the study is unclear.

      We reorganized the manuscript to make its focus clearer.

      Introduction: We clarified our explanation of the state-of-the-art. Briefly, prior work on activity-dependent homeostasis has focused on regulating ion channel density. Neurons have also been documented to homeostatically regulate channel voltage-dependence. However, the consequences of channel voltage-dependence alterations on homeostatic regulation remain underexplored. To study this, we extend a computational model of activity-dependent homeostasis — originally developed to only alter channel density— to alter channel voltage-dependence.

      Results: We reorganized this section to underscore the main point: that the timescale of half-(in)activation alterations influences the intrinsic properties and activity patterns targeted by a homeostatic mechanism. Figures 1A and 1B were retained to provide context—Figure 1A illustrates how activity can emerge from random initial conditions, while Figure 1B suggests that in these simulations, modulation of half-(in)activation played a specific limited role. Figure 2 builds on Figure 1A by summarizing how intrinsic properties and activity characteristics vary across a population of 20 bursters. Figure 3 then demonstrates that despite playing this specific limited role, altering the timescale of half-(in)activation in these simulations significantly impacted the intrinsic properties and activity characteristics of the bursters targeted by the homeostatic mechanism. Figure 4 supports this by offering a possible mechanistic explanation. Finally, Figure 5 reinforces the central message by showing how the same population responds to perturbation when the timescale of half-(in)activation alterations is varied—essentially extending the analysis of Figure 3 to a perturbed regime.

      Discussion: The Discussion concentrates more specifically on how the timescale of half-(in)activation alterations shapes the bursters targeted by the homeostatic mechanism. Extended content on model assumptions is moved to Methods. The discussion of biological pathways that implement channel voltage-dependence is shortened to avoid distracting from the main message.

      Methods: Aside from moving model assumptions here, we removed discussion of the “Group of 5” and explained in more detail why we chose the L8 norm.

      (2) The main outcomes and conclusions of the study are difficult to grasp. What is predicted or explained by this new version of homeostatic regulation of neuronal activity?

      Our message is general: the timescale of half-(in)activation alterations influences the intrinsic properties and activity characteristics of bursters targeted by a homeostatic mechanism. As such, the implications are general. Their value lies in circumscribing a conceptual framework from which experimentalists may devise and test new hypotheses. We do not aim to predict or explain any specific phenomenon in this work. To address this concern the Discussion highlights two potential implications of our findings—one to neuronal development and another to pathologies that may arise from disruptions to homeostatic processes:

      “One application for the simulations involving the self-assembly of activity may be to model the initial phases of neural development, when a neuron transitions from having little or no electrical activity to possessing it (Baccaglini & Spitzer 1977). As shown in Figure 6, the timescale of (in)activation curve alterations defines a neuron's activity characteristics and intrinsic properties. As such, neurons may actively adjust these timescales to achieve a specific electrical activity aligned with a developmental phase’s activity targets. Indeed, developmental phases are marked by changes in ion channel density and voltage-dependence, leading to distinct electrical activity at each stage (Baccaglini & Spitzer 1977, Gao & Ziskind-Conhaim 1998, Goldberg et al 2011, Hunsberger & Mynlieff 2020, McCormick & Prince 1987, Moody & Bosma 2005, O'Leary et al 2014, Picken Bahrey & Moody 2003).

      Additionally, our results show that activity-dependent regulation of channel voltage-dependence can play a critical role in restoring neuronal activity during perturbations (Figure 5). Specifically, the presence and timing of half-(in)activation modulation influenced whether the model neuron could successfully return to its target activity pattern. Many model neurons only achieved recovery when a half-(in)activation mechanism was present. Moreover, the speed of this modulation shaped recovery outcomes in nuanced ways: some model neurons reached their targets only when voltage-dependence was adjusted rapidly, while others did so only when these changes occurred slowly. These observations all suggest that impairments in a neuron’s ability to modulate the voltage-dependence of its channels may lead to disruptions in activity-dependent homeostasis. This may have implications for conditions such as addiction (Kourrich et al 2015) and Alzheimer’s disease (Styr & Slutsky 2018), where disruptions in homeostatic processes are thought to contribute to pathogenesis.”

      Reviewer #3 (Public review):

      Mondal et al. use computational modeling to investigate how activity-dependent shifts in voltage-dependent (in)activation curves can complement activity-dependent changes in ion channel conductance to support homeostatic plasticity. While changes in the voltage-dependent properties of ion channels are known to modulate neuronal excitability, their role as a homeostatic plasticity mechanism interacting with channel conductance has been largely unexplored. The results presented here demonstrate that activity-dependent regulation of voltage-dependent properties can interact with plasticity in channel conductance to allow neurons to attain and maintain target activity patterns, in this case, intrinsic bursting. These results also show that the rate of channel voltage-dependent shifts can influence steady-state parameters reached as the model stabilizes into a stable intrinsic bursting state. That is, the rate of these modifications shapes the range of channel conductances and half-(in)activation parameters as well as activity characteristics such as burst period and duration. A major conclusion of the study is that altering the timescale of channel voltage dependence can seamlessly shift a neuron's activity characteristics, a mechanism that the authors argue may be employed by neurons to adapt to perturbations. While the study's conclusions are mostly well-supported, additional analyses and simulations are needed.

      (1) A main conclusion of this study is that the speed at which (in)activation dynamics change determines the range of possible electrical patterns. The authors propose that neurons may dynamically regulate the timescale of these changes (a) to achieve alterations in electrical activity patterns, for example, to preserve the relative phase of neuronal firing in a rhythmic network, and (b) to adapt to perturbations. The results presented in Figure 4 clearly demonstrate that the timescale of (in)activation modifications impacts the range of activity patterns generated by the model as it transitions from an initial state of no activity to a final steady-state intrinsic burster. This may have important implications for neuronal development, as discussed by the authors.

      However, the authors also argue that the model neuron's dynamics (such as period and burst duration) could be dynamically modified by altering the timescale of (in)activation changes (Figure 6 and related text). The simulations presented here, however, do not test whether modifications in this timescale can shift the model's activity features once it reaches steady state. In fact, it is unlikely that this would be the case since, at steady state, calcium targets are already satisfied. It is likely, however, as the authors suggest, that the rate at which (in)activation dynamics change may be important for neuronal adaptation to perturbations, such as changes in temperature or extracellular potassium. Yet, the results presented here do not examine how modifying this timescale influences the model's response to perturbations. Adding simulations to characterize how alterations in the rate of (in)activation dynamics affect the model's response to perturbations, such as transiently elevated extracellular potassium (Figure 5), would strengthen this conclusion.

      The reviewer suggests that our core message — namely, that the timescale of half-(in)activation alterations influences the intrinsic properties and activity patterns targeted by a homeostatic mechanism — should also hold during perturbations. We agree that this extension strengthens the central message and have incorporated it into the subsection of the Results (“Half-(in)activation Alterations Contribute to Activity Homeostasis”) and Figure 5.

      (2) Another key argument in this study is that small, coordinated changes in channel (in)activation contribute to shaping neuronal activity patterns, but that these subtle effects may be obscured when averaging across a population of neurons. This may be the case; however, the results presented don't clearly demonstrate this point. This point would be strengthened by identifying correlations, if they exist, between (in)activation curves, conductance, and the resulting bursting patterns of the models for the simulations presented in Figure 2 and Figure 4, for example. Alternatively, or additionally, relationships between (in)activation curves could be probed by perturbing individual (in)activation curves and quantifying how the other model parameters compensate, which could clearly illustrate this point.

      In part of the Discussion, we noted that small, coordinated shifts in half-(in)activation curves could be obscured when averaging across a population of neurons. Our intention was not to present this as a primary result, but to highlight an emergent consequence of the model: that distinct initial maximal conductances may converge to activity targets via different small shifts in half-(in)activation, making such changes difficult to detect at the population level. However, we did not systematically examine correlations between (in)activation parameters, conductances, and activity features, nor how these correlations might vary with the timescale of (in)activation modulation. While this observation is consistent with model behavior, it does not directly advance the study’s main point — that the timescale of half-(in)activation modulation influences the types of bursting patterns that satisfy the activity target. To keep the focus clear, we have removed this remark from the Discussion, though we agree that a more detailed analysis of these correlations may offer a fruitful direction for future work.

      Reviewer #1 (Recommendations for the authors):

      Minor comments:

      (1) Page 5: remove "an" from "achieve a given an activity..."

      The sentence containing this error has been removed.

      (2) Page 7, bottom of page. Explain what prespecifying means here. This requires a conceptual explanation, even if the equations are given in the methods. Was one working ad hoc model built from which the three sensor values were chosen? What was this model and how was it benchmarked? The sensors are never shown in any figure, but presumably they have different kinetics. What is meant by "average value"? What was the window of averaging and why?

      The intention of this passage was to provide a broad overview of the homeostatic mechanism, with the rationale for using sensor “averages” as homeostatic targets explained in detail in the Methods. We have replaced the word “average” with “target” to maintain this focus.

      (3) Page 9: add "the" in "electrical activity of the neuron as [the] model seeks...".

      Done

      (4) Page 9: say briefly what alpha is before using it. Also, please be consistent in either using the symbol for alpha or spelling it out across the manuscript and the figures.

      Done

      (5) Page 10: the paragraph "In general, ..." is confusing although it becomes clear later on what this is all about. Please rewrite and expand this to clarify some points. For instance, the word "degenerate" is first used here and it is unclear in what sense these models are degenerate. Then it is unclear why the first 5 models were chosen and then 15 more added. What was the point of doing this? What is the intent? Set this up properly before saying that you just did it. This also would clarify the weird terminology used later on of Group of 20 vs. Group of 5. The 20 and 5 are arbitrary. Say what the purpose is. Finally, is the "mean" at the very end the same 416 ms? If not, what do you mean by "the mean"? In fact, I find these 2% and 20% to be imprecise substitutes of (say) two distinct values of CV which are an order of magnitude different. Is that the intent?

      This comment refers to a passage that was removed during revision.

      (6) Page 10: this may be clear to you, but it took me a while to understand that in Figure 1C, you took the working model at the end of 1A, fixed the gmax values and randomized just the half-act/inact values to run it. Perhaps rewrite this to clarify?

      This comment refers to a figure that was removed during revision.

      (7) Page 13: why do channel densities not change much after the perturbation?

      This comment refers to a figure that has since been reworked during revision. In particular, we only study what happens during perturbation. This question is interesting and is the subject of ongoing work.

      Reviewer #2 (Recommendations for the authors):

      The article should be carefully corrected, because the current quality of writing might obscure the interest of the study. Particular attention should be paid to the state-of-the-art section and to the discussion, but even the writing of the results should be carefully reworked. The current state of the article makes it very difficult to understand the motivation behind the study but also what the main result provided by this work is.

      The Introduction, Results, and Discussion have been reworked to build on the central premise of the work: the timescale of half-(in)activation alterations influences the intrinsic properties and activity patterns targeted by the neuron’s homeostatic mechanism. These changes are detailed in Public Comment #1.

      Reviewer #3 (Recommendations for the authors):

      The manuscript presents an interesting computational study exploring how activity-dependent regulation of (in)activation dynamics interacts with conductance plasticity to shape neuronal activity patterns. While the study provides valuable insights, some aspects would benefit from clarification, further analyses, and/or additional simulations to strengthen the conclusions. Below, I outline concerns and comments related to specific details of the model and results presentation that were not included in the public review.

      (1) The results presented in Figure 5 show that adaptation occurs in both channel conductances and (in)activation dynamics; however, the changes in conductance remain relatively permanent after the model recovers from the transient elevation in extracellular potassium. It therefore seems likely that the model would recover bursting more quickly in response to a subsequent exposure to simulated elevated extracellular potassium since large modifications in the slowly changing conductances would not be required. If this is the case, it could provide a plausible mechanism for adaptation to repeated high-potassium exposure, as demonstrated experimentally in Cancer borealis by this group (PMID: 36060056).

      This is an astute observation and the subject of our present follow-up investigation.

      (2) In the text relating to Figure 5, it is argued that the resulting shifts in (in)activation curves may be conceptualized as alterations in window currents. It would be helpful to illustrate this by plotting and comparing changes in window currents of these channels alongside the changes in their (in)activation curves.

      This comment refers to a passage that was removed during revision.

      (3) Some discussion of the role these homeostatic mechanisms may play when the neuron is synaptically integrated into a rhythmically active network could be informative. Surely, phasic and tonic inputs to the neuron would alter its conductance and voltage-dependent properties. Therefore, the model's parameters in an intact network could be very different from those in the synaptically isolated case.

      This is an excellent point. We agree that synaptic context—particularly tonic and phasic inputs—would likely influence a neuron’s conductances and voltage-dependent properties, potentially leading to different homeostatic outcomes than in the isolated case. While our current study focuses on synaptically isolated neurons, the Marder lab has considered how homeostatically stabilized neurons might interact in network settings. For example, O'Leary et al (2014) presents an example network of three such neurons operating under homeostatic regulation. However, systematically exploring this question remains a challenge. We are currently developing ideas to study this in the context of a simplified half-center oscillator model, where network-level dynamics can be more tractably analyzed.

      (4) Why are the transitions of alpha typically so abrupt, essentially either 1 or 0? Similarly, what happens in the model when there are transient transitions from what appears to be a steady-state alpha that abruptly shifts from 0 to 1 or 1 to 0? For example, what is occurring in Figure 1A at ~150s and ~180s when alpha jumps between 1 and 0, or in Figure 1B when the model transiently jumps up from 0 to 1 at ~400s and ~830s? In Figure 1A, does the bursting pattern change at all after ~250s, or is it identical to the pattern at c?

      This is addressed in the revision (Lines 141 – 150).

      (5) Are the final steady-state parameters of the 25 (sic) models consistent with experimental observations?

      It is difficult to assess — it is hard to design an experiment to do what the reviewer is suggesting.

      (6) Why isn't gL allowed to change dynamically? This seems like the most straightforward way to allow a neuron to adjust its excitability (aside from tonic synaptic inputs).

      Passive currents could, in principle, be subject to homeostatic regulation. However, our study focused on active intrinsic currents. This focus stems from earlier investigations, which showed that active currents are dynamically regulated during homeostasis, for instance Turrigiano et al (1995) and Desai et al (1999).

      Alonso LM, Rue MCP, Marder E. 2023. Gating of homeostatic regulation of intrinsic excitability produces cryptic long-term storage of prior perturbations. Proc Natl Acad Sci U S A 120: e2222016120

      Baccaglini PI, Spitzer NC. 1977. Developmental changes in the inward current of the action potential of Rohon-Beard neurones. J Physiol 271: 93-117

      Desai NS, Rutherford LC, Turrigiano GG. 1999. Plasticity in the intrinsic excitability of cortical pyramidal neurons. Nature Neuroscience 2: 515-20

      Gao BX, Ziskind-Conhaim L. 1998. Development of ionic currents underlying changes in action potential waveforms in rat spinal motoneurons. J Neurophysiol 80: 3047-61

      Goldberg EM, Jeong HY, Kruglikov I, Tremblay R, Lazarenko RM, Rudy B. 2011. Rapid developmental maturation of neocortical FS cell intrinsic excitability. Cereb Cortex 21: 666-82

      Hunsberger MS, Mynlieff M. 2020. BK potassium currents contribute differently to action potential waveform and firing rate as rat hippocampal neurons mature in the first postnatal week. J Neurophysiol 124: 703-14

      Kourrich S, Calu DJ, Bonci A. 2015. Intrinsic plasticity: an emerging player in addiction. Nature Reviews Neuroscience 16: 173-84

      McCormick DA, Prince DA. 1987. Post-natal development of electrophysiological properties of rat cerebral cortical pyramidal neurones. J Physiol 393: 743-62

      Moody WJ, Bosma MM. 2005. Ion channel development, spontaneous activity, and activity-dependent development in nerve and muscle cells. Physiol Rev 85: 883-941

      O'Leary T, Williams AH, Franci A, Marder E. 2014. Cell types, network homeostasis, and pathological compensation from a biologically plausible ion channel expression model. Neuron 82: 809-21

      Picken Bahrey HL, Moody WJ. 2003. Early development of voltage-gated ion currents and firing properties in neurons of the mouse cerebral cortex. J Neurophysiol 89: 1761-73

      Styr B, Slutsky I. 2018. Imbalance between firing homeostasis and synaptic plasticity drives early-phase Alzheimer’s disease. Nature Neuroscience 21: 463-73

      Turrigiano G, LeMasson G, Marder E. 1995. Selective regulation of current densities underlies spontaneous changes in the activity of cultured neurons. J Neurosci 15: 3640-52

    1. eLife Assessment

      This valuable study demonstrates that D1- and D2-striatal neurons receive distinct cortical inputs, offering key insights into corticostriatal function. For instance, in the context of striatal-dependent learning, this distinction is highly informative for interpreting synaptic physiology data, particularly when inputs to one neuron subtype may change independently of the other. The strength of the evidence is solid, with anatomical and electrophysiological findings aligning well with results from optogenetic and behavioral studies.

    2. Reviewer #1 (Public review):

      Summary:

      The study by Klug et al. investigated the pathway specificity of corticostriatal projections, focusing on two cortical regions. Using a G-deleted rabies system in D1-Cre and A2a-Cre mice to retrogradely deliver channelrhodopsin to cortical inputs, the authors found that M1 and MCC inputs to direct and indirect pathway spiny projection neurons (SPNs) are both partially segregated and asymmetrically overlapping. In general, corticostriatal inputs that target indirect pathway SPNs are likely to also target direct pathway SPNs, while inputs targeting direct pathway SPNs are less likely to also target indirect pathway SPNs. Such asymmetric overlap of corticostriatal inputs has important implications for how the cortex itself may determine striatal output. Indeed, the authors provide behavioral evidence that optogenetic activation of M1 or MCC cortical neurons that send axons to either direct or indirect pathway SPNs can have opposite effects on locomotion and different effects on action sequence execution. The conclusions of this study add to our understanding of how cortical activity may influence striatal output and offer important new clues about basal ganglia function.

      The conceptual conclusions of the manuscript are supported by the data, but the details of the magnitude of afferent overlap and causal role of asymmetric corticostriatal inputs on some behavioral outcomes may be a bit overstated given technical limitations of the experiments.

      For example, after virally labeling either direct pathway (D1) or indirect pathway (D2) SPNs to optogenetically tag pathway-specific cortical inputs, the authors report that a much larger number of "non-starter" D2-SPNs from D2-SPN labeled mice responded to optogenetic stimulation in slices than "non-starter" D1 SPNs from D1-SPN labeled mice did. Without knowing the relative number of D1 or D2 SPN starters used to label cortical inputs, it is difficult to interpret the exact meaning of the lower number of responsive D2-SPNs in D1 labeled mice (where only ~63% of D1-SPNs themselves respond) compared to the relatively higher number of responsive D1-SPNs (and D2-SPNs) in D2 labeled mice. While relative differences in connectivity certainly suggest that some amount of asymmetric overlap of inputs exists, differences in infection efficiency and ensuing differences in detection sensitivity in slice experiments make determining the degree of asymmetry problematic.

      It is also unclear if retrograde labeling of D1-SPN- vs D2-SPN- targeting afferents labels the same densities of cortical neurons. This gets to the point of specificity in some of the behavioral experiments. If the target-based labeling strategies used to introduce channelrhodopsin into specific SPN afferents label significantly different numbers of cortical neurons, might the difference in the relative numbers of optogenetically activated cortical neurons itself lead to behavioral differences?

    3. Reviewer #2 (Public review):

      Summary:

      Klug et al. use monosynaptic rabies tracing of inputs to D1- vs D2-SPNs in the striatum to study how separate populations of cortical neurons project to D1- and D2-SPNs. They use rabies to express ChR2, then patch D1- or D2-SPNs to measure synaptic input. They report that cortical neurons labeled as D1-SPN-projecting preferentially project to D1-SPNs over D2-SPNs. In contrast, cortical neurons labeled as D2-SPN-projecting project equally to D1- and D2-SPNs. They go on to conduct pathway-specific behavioral stimulation experiments. They compare direct optogenetic stimulation of D1- or D2-SPNs to stimulation of MCC inputs to DMS and M1 inputs to DLS. In three different behavioral assays (open field, intra-cranial self-stimulation, and a fixed ratio 8 task), they show that stimulating MCC or M1 cortical inputs to D1-SPNs is similar to D1-SPN stimulation, but that stimulating MCC or M1 cortical inputs to D2-SPNs does not recapitulate the effects of D2-SPN stimulation (presumably because both D1- and D2-SPNs are being activated by these cortical inputs).

      Strengths:

      Showing these same effects in three distinct behaviors is strong. Overall, the functional verification of the consequences of the anatomy is very nice to see. It is a good choice to patch only from mCherry-negative non-starter cells in the striatum. This study adds to our understanding of the logic of corticostriatal connections, suggesting a previously unappreciated structure.

      Weaknesses:

      One limitation is that all inputs to SPNs are expressing ChR2, so they cannot distinguish between different cortical subregions during patching experiments. Their results could arise because the same innervation patterns are repeated in many cortical subregions or because some subregions have preferential D1-SPN input while others do not. There are also some caveats with respect to the efficacy of rabies tracing. Although they only patch non-starter cells in the striatum, only 63% of D1-SPNs receive input from D1-SPN-projecting cortical neurons. It's hard to say whether this is "high" or "low," but one question is how far from the starter cell region they are patching. Without this spatial indication of where the cells that are being patched are relative to the starter population, it is difficult to interpret if the cells being patched are receiving cortical inputs from the same neurons that are projecting to the starter population. The authors indicate they are patching from mCherry-negative neurons within the region of the mCherry-positive neurons, but since the mCherry population will include both true starter cells and monosynaptically connected cells, this is not perfectly precise. Convergence of cortical inputs onto SPNs may vary with distance from the starter cell region quite dramatically, as other mapping studies of corticostriatal inputs have shown specialized local input regions can be defined based on cortical input patterns (Hintiryan et al., Nat Neurosci, 2016; Hunnicutt et al., eLife, 2016; Peters et al., Nature, 2021). A caveat for the optogenetic behavioral experiments is that they did not include fluorophore-only controls, although a different control (with light delivered in M1) is provided in Supplementary Figure 3. Another point of confusion is that other studies (Cui et al., J Neurosci, 2021) have reported that stimulation of D1-SPNs in DLS inhibits rather than promotes movement. This study may have given different results due to subtly different experimental parameters, including fiber optic placement and NA.

    4. Reviewer #3 (Public review):

      Review of resubmission: The authors provided a response to the reviews from myself and other reviewers. While some points were made satisfactorily, particularly in clarification of the innervation of cortex to striatum and the effects of input stimulation, many of my points remain unaddressed. In several cases, the authors chose to explain their rationale rather than address the issues at hand. A number of these issues (in fact, the majority) could be addressed simply by toning down the confidence in conclusions, so it was disappointing to see that the authors by and large did not do this. I repeat my concerns below and note whether I find them to have been satisfactorily addressed or not.

      In the manuscript by Klug and colleagues, the investigators use a rabies virus-based methodology to explore potential differences in connectivity from cortical inputs to the dorsal striatum. They report that the connectivity from cortical inputs onto D1 and D2 MSNs differs in terms of their projections onto the opposing cell type, and use these data to infer that there are differences in cross-talk between cortical cells that project to D1 vs. D2 MSNs. Overall, this manuscript adds to the overall body of work indicating that there are differential functions of different striatal pathways which likely arise at least in part by differences in connectivity that have been difficult to resolve due to difficulty in isolating pathways within striatal connectivity, and several interesting and provocative observations were reported. Several different methodologies are used, with partially convergent results, to support their main points.

      However, I have significant technical concerns about the manuscript as presented that make it difficult for me to interpret the results of the experiments. My comments are below.

      Major:

      There is generally a large caveat to the rabies studies performed here, which is that both TVA and the ChR2-expressing rabies virus have the same fluorophore. It is thus essentially impossible to determine how many starter cells there are, what the efficiency of tracing is, and which part of the striatum is being sampled in any given experiment. This is a major caveat given the spatial topography of the cortico-striatal projections. Furthermore, the authors make a point in the introduction about previous studies not having explored absolute numbers of inputs, yet this is not at all controlled in this study. It could be that their rabies virus simply replicates better in D1-MSNs than D2-MSNs. No quantifications are done, and these possibilities do not appear to have been considered. Without a greater standardization of the rabies experiments across conditions, it is difficult to interpret the results.

      This is still an issue. The authors point out why they chose various vectors. I can understand why the authors chose the fluorophores etc. that they did, yet the issues I raised previously are still valid. The discussion should mention that this is a potential issue. It does not necessarily invalidate results, but it is an issue. Furthermore, it is possible (in all systems) that rabies replicates better/more efficiently in some cells than others. This is one possible interpretation that has not really been explored in any study. I don't suggest the authors attempt to do that, but it should be raised as a potential interpretation. If the rabies results could mean several different things, the authors owe it to the readership to state all possible interpretations of data.

      The authors claim using a few current clamp optical stimulation experiments that the cortical cells are healthy, but this result was far from comprehensive. For example, membrane resistance, capacitance, general excitability curves, etc are not reported. In Figure S2, some of the conditions look quite different (e.g., S2B, input D2-record D2, the method used yields quite different results that the authors write off as not different). Furthermore, these experiments do not consider the likely sickness and death that occurs in starter cells, as has been reported elsewhere. Health of cells in the circuit is overall a substantial concern that alone could invalidate a large portion, if not all, of the behavioral results. This is a major confound given those neurons are thought to play critical roles in the behaviors being studied. This is a major reason why first-generation rabies viruses have not been used in combination with behavior, but this significant caveat does not appear to have been considered, and controls e.g., uninfected animals, infected with AAV helpers, etc, were not included.

      This issue remains unaddressed. I did not request clarity about experimental design, but rather, raised issues about the potential effects of toxicity. I believe this to be a valid concern that needs to be discussed in the manuscript, especially given what look visually like potential differences in S2.

      The overall purity (e.g., EnvA pseudotyping efficiency) of the RABV prep is not shown. If there was a virus that was not well EnvA-pseudotyped and thus could directly infect cortical (or other) inputs, it would degrade specificity.

      This issue has not been addressed. Viral strain is irrelevant. The quality of the specific preparations used is what matters.

      While most of the study focuses on the cortical inputs, in slice recordings, inputs from the thalamus are not considered, yet likely contribute to the observed results. Related to this, in in vivo optogenetic experiments, technically, if the thalamic or other inputs to the dorsal striatum project to the cortex, their method will not only target cortical neurons but also terminals of other excitatory inputs. If this cannot be ruled out, the statement that the authors are able to selectively activate the cortical inputs to one or the other population should be toned down.

      The authors added text to the discussion to address this point. While it largely does what is intended, based on the one study cited, I disagree with the authors' conclusions that it is "clear" that potential contamination from other sites does not play a role. The simplest interpretation is the one the authors state, and there is some supporting evidence to back up that assertion, but to me that falls short of making the point "clear" that there are no other interpretations.

      The statements about specificity of connectivity are not well founded. It may be that in the specific case where they are assessing outside of the area of injections, their conclusions may hold (e.g., excitatory inputs onto D2s have more inputs onto D1s than vice versa). However, how this relates to the actual site of injection is not clear. At face value, if such a connectivity exists, it would suggest that D1-MSNs receive substantially more overall excitatory inputs than D2s. It is thus possible that this observation would not hold over other spatial intervals. This was not explored and thus the conclusions are over-generalized. e.g., the distance from the area of red cells in the striatum to recordings was not quantified, what constituted a high level of cortical labeling was not quantified, etc. Without more rigorous quantification of what was being done, it is difficult to interpret the results.

      Again, the goal here would be to make a statement about this in the discussion to clarify limitations of the study. I don't expect the authors to re-do all of these experiments, but since they are discussing the corticostriatal circuits, which have multiple subdomains, this remains a relevant point. It has not been addressed.

      The results in Figure 3 are not well controlled. The authors show contrasting effects of optogenetic stimulation of D1-MSNs and D2-MSNs in the DMS and DLS, results which are largely consistent with the canon of basal ganglia function. However, when stimulating cortical inputs, stimulating the inputs to D1-MSNs gives the expected results (increased locomotion) while stimulating putative inputs to D2-MSNs had no effect. This is not the same as showing a decrease in locomotion; a null effect here is not possible to interpret.

      I think that the caveat of showing no clear effects of stimulating inputs to D2-MSNs should be pointed out. Yes, I understand that the viruses appeared to express etc., but again it remains possible that the results are driven by a lack of, e.g., sufficient ChR2 expression. Short of a full quantification of the number of cells expressing ChR2 and of the overlap between fiber placement and ChR2 expression (which I don't suggest), this possibility cannot be excluded and should be pointed out.

      In the light of their circuit model, the result showing that inputs to D2-MSNs drive ICSS is confusing. How can the authors account for the fact that these cells are not locomotor-activating, stimulation of their putative downstream cells (D2-MSNs) does not drive ICSS, yet the cortical inputs drive ICSS? Is the idea that these inputs somehow also drive D1s? If this is the case, how do D2s get activated, if all of the cortical inputs tested net activate D1s and not D2s? Same with the results in Figure 4 - the inputs and putative downstream cells do not have the same effects. Given potential caveats of differences in viral efficiency, spatial location of injections, and cellular toxicity, I cannot interpret these experiments.

      The explanation the authors provide in their rebuttal makes sense; however, it should be included in the discussion of the manuscript, as it is interesting and relevant.

    1. eLife Assessment

      This fundamental work substantially advances our understanding of the molecular basis by which early symmetry breaking events connect to the following cell fate specifications in preimplantation mammalian embryos. The evidence supporting the conclusions is compelling, with advanced image based assays and microinjection based functional tests. The work will be of broad interest to cell and developmental biologists.

    2. Reviewer #1 (Public review):

      Summary:

      This work starts with the observation that embryo polarization is asynchronous starting at the early 8-cell stage, with early polarizing cells being biased towards producing the trophectoderm (TE) lineage. The authors further found that reduced CARM1 activity and upregulation of its substrate BAF155 promote early polarization and TE specification. This evidence connects to the previous finding that Carm1 heterogeneity at the 4-cell stage guides later cell lineages: the higher Carm1-expressing blastomeres are biased towards the ICM lineage. Thus, this work provides a link between asymmetries at the 4-cell stage and polarization at the 8-cell stage, providing a cohesive explanation regarding the first lineage allocation in mouse embryos.

      Strengths:

      In addition to what has been put in the summary, the advanced 3D image-based analysis has found that early polarization is associated with a change in cell geometry in blastomeres, regarding the ratio of the long axis to the short axis. This is considered a new observation that has not previously been reported.

      Weaknesses:

      Although the microinjection-based method for overexpression/deletion of proteins has been shown to be effective in early embryo settings and has been widely used, it may not fully represent the in vivo situation in some cases, compared to other strategies such as the use of knock-in mice.

      This is a minor weakness and has been discussed by the author in the revised manuscript.

    1. eLife Assessment

      This manuscript applies state-of-the-art techniques to define the cellular composition of the dorsal vagal complex in two rodent species (mice and rats). The result is a fundamental resource that substantially advances our understanding of the dorsal vagal complex's role in the regulation of feeding and metabolism while also highlighting key differences between species. The analyses of single-cell profiling experiments in the manuscript provide compelling insight into the cellular architecture of the dorsal vagal complex, with potential implications for obesity therapeutics.

    2. Reviewer #1 (Public review):

      Summary:

      This paper uses state-of-the-art techniques to define the cellular composition and its complexity in two rodent species (mice and rats). The study is built on available datasets but extends those in a way that will facilitate future research. The study will be of high impact for the study of metabolic control.

      Strengths:

      After revision, the paper is much improved. I have no further comments.

    3. Reviewer #2 (Public review):

      In this manuscript, Hes et al. present a comprehensive multi-species atlas of the dorsal vagal complex (DVC) using single-nucleus RNA sequencing, identifying over 180,000 cells and 123 cell types across five levels of granularity in mice and rats. Intriguingly, the analysis uncovered previously uncharacterized cell populations, including Kcnj3-expressing astrocytes, neurons co-expressing Th and Cck, and a population of leptin receptor-expressing neurons in the rat area postrema, which also express the progenitor marker Pdgfra. These findings suggest species-specific differences in appetite regulation. This study provides a valuable resource for investigating the intricate cellular landscape of the DVC and its role in metabolic control, with potential implications for refining obesity treatments targeting this hindbrain region.

      In line with previous work published by the PI, the topic is of clear scientific relevance, and the data presented in this manuscript are both novel and compelling. Additionally, the manuscript is well-structured, and the conclusions are robust and supported by the data. Overall, this study significantly enhances our understanding of the DVC and sheds light on key differences between rats and mice.

      I have reviewed the revised manuscript and am pleased to confirm that the authors have addressed my previous comments and concerns.

    1. eLife Assessment

      Cryptovaranoides, an end-Triassic animal (just over 200 Ma old), was originally described as a possibly anguimorph squamate, i.e., more closely related to snakes and some extant lizards than to other extant lizards, making Squamata much older than previously thought and providing a new calibration date inside it. Following a rebuttal and a defense, this fourth important contribution to the debate makes a meticulous and solid argument that Cryptovaranoides is not a squamate. However, further comparisons to potentially closely related animals would greatly benefit this study, and parts of the text require clarification.

    2. Reviewer #1 (Public review):

      In the Late Triassic and Early Jurassic (around 230 to 180 Ma ago), southern Wales and adjacent parts of England were a karst landscape. The caves and crevices accumulated remains of small vertebrates. These fossil-rich fissure fills are being exposed in limestone quarrying. In 2022 (reference 13 of the article), a partial articulated skeleton and numerous isolated bones from one fissure fill of end-Triassic age (just over 200 Ma) were named Cryptovaranoides microlanius and described as the oldest known squamate - the oldest known animal, by some 20 to 30 Ma, that is more closely related to snakes and some extant lizards than to other extant lizards. This would have considerable consequences for our understanding of the evolution of squamates and their closest relatives, especially for their speed and absolute timing, and was supported in the same paper by phylogenetic analyses based on different datasets.

      In 2023, the present authors published a rebuttal (reference 18) to the 2022 paper, challenging anatomical interpretations and the irreproducible referral of some of the isolated bones to Cryptovaranoides. Modifying the datasets accordingly, they found Cryptovaranoides outside Squamata and presented evidence that it is far outside. In 2024 (reference 19), the original authors defended most of their original interpretation and presented some new data, some of it from newly referred isolated bones. The present article discusses anatomical features and the referral of isolated bones in more detail, documents some clear misinterpretations, argues against the widespread but not justifiable practice of referring isolated bones to the same species as long as there is merely no known evidence to the contrary, further argues against comparing newly recognized fossils to lists of diagnostic characters from the literature as opposed to performing phylogenetic analyses and interpreting the results, and finds Cryptovaranoides outside Squamata again.

      Although a few of the character discussions and the discussion of at least one of the isolated bones can probably still be improved (and two characters are addressed twice), I see no sign that the discussion is going in circles or otherwise becoming unproductive. I can even imagine that the present contribution will end it.

    3. Reviewer #2 (Public review):

      Congratulations on this thorough manuscript on the phylogenetic affinities of Cryptovaranoides. Recent interpretations of this taxon, and perhaps some others, have greatly changed the field's understanding of reptile origins, for better and (likely) for worse.

      This manuscript offers a careful review of the features used to place Cryptovaranoides within Squamata and adequately demonstrates that this interpretation is misguided, and therefore reconciles morphological and molecular data, which is an important contribution to the field of paleontology. The presence of any crown squamate in the Permian or Triassic should be met with skepticism, the same sort of skepticism provided in this manuscript.

      I have outlined some comments addressing some weaknesses that I believe will further elevate the scientific quality of the work. A brief, fresh read-through to refine a few phrases, particularly where the discussion references Whiteside et al., could also give the paper an even more collegial tone.

      This manuscript can be substantially improved by additional discussion and figures, where applicable. When I first read this manuscript, I was a bit surprised at how little discussion there was concerning both non-lepidosauromorph lepidosaurs as well as stem-reptiles more broadly. This paper makes it extremely clear that Cryptovaranoides is not a squamate, but would greatly benefit from explaining why many of the characters that were either suggested by former studies to be squamate in nature or optimized as such in phylogenetic analyses are rather widespread plesiomorphies present in crownward sauropsids such as millerettids, younginids, or tangasaurids. I suggest citing this work where applicable and building some of the discussion for a greatly improved manuscript. In sum:

      (1) The discussion of stem-reptiles should be improved. Nearly all of the supposed squamate features in Cryptovaranoides are present in various stem-reptile groups. I've noted a few, but this would be a fairly quick addition to this work. If this manuscript incorporates this advice, I believe arguments regarding the affinities of Cryptovaranoides (at least within Squamata) will be finished, and this manuscript will be better off for it.

      (2) I was also surprised at how little discussion there was here of putative stem-squamates or lepidosauromorphs more broadly. A few targeted comparisons could really benefit the manuscript. It is currently unclear as to why Cryptovaranoides could not be a stem-lepidosaur, although I know that the lepidosaur total-group in these manuscripts lacks character sampling due to their scarcity.

      (3) This manuscript can be improved by additional figures, such as the slice data of the humerus. The poor quality of the scan data for Cryptovaranoides is stated several times in this paper, yet the scan data is often used as evidence for the presence or absence of often minute features without discussion, leaving doubts as to which condition is true. Otherwise, several sections can be rephrased to acknowledge uncertainty, and some character scorings in other studies should probably be changed to '?'.

    4. Reviewer #3 (Public review):

      Summary:

      The study provides an interesting contribution to our understanding of the relationships of Cryptovaranoides, which are a matter of intensive debate among researchers. My main concerns are in regard to the wording of some statements, but generally, the discussion and data are well prepared. I would recommend moderate revisions.

      Strengths:

      (1) Detailed analysis of the discussed characters.

      (2) Illustrations of some comparative materials.

      Weaknesses:

      Some parts of the manuscript require clarification and rewording.

      One of the main points of criticism of Whiteside et al. is using characters for phylogenetic considerations that are not included in the phylogenetic analyses therein. The authors call it a "non-trivial substantive methodological flaw" (page 19, line 531). I would step down from such a statement for the reasons listed below:

      (1) Comparative anatomy is not about making phylogenetic analyses. Comparative anatomy is about comparing different taxa in search of characters that are unique and characters that are shared between taxa. This creates an opportunity to assess the level of similarity between the taxa and create preliminary hypotheses about homology. Therefore, comparative anatomy can provide some phylogenetic inferences. That does not mean that tests of congruence are not needed. Such comparisons are the first step that allows creating phylogenetic matrices for analysis, which is the next step of phylogenetic inference. That does not mean that all the papers with new morphological comparisons should end with a new or expanded phylogenetic matrix. Instead, such papers serve as a rationale for future papers that focus on building phylogenetic matrices.

      (2) Phylogenetic matrices are never complete, both in terms of morphological disparity and taxonomic diversity. I don't know if it is even possible to have a complete one, but at least we can say that we are far from that. Criticising a work that did not include all the possibly relevant characters in the phylogenetic analysis is simply unfair. The authors should know that creating/expanding a phylogenetic matrix is a never-ending work, beyond the scope of any paper presenting a new fossil.

      (3) Each additional taxon has the possibility of inducing a rethinking of characters. That includes new characters, new character states, character state reordering, etc. As I said above, it is usually beyond the scope of a paper with a new fossil to accommodate that into the phylogenetic matrix, as it requires not only scoring the newly described taxon but also rescoring many that are already scored. Since the digitization of fossils is still rare, it requires a lot of collection visits that are costly in terms of time.

      (4) If I were to search for a true flaw in the Whiteside et al. paper, I would check if there is a confirmation bias. The mentioned paper should not only search for characters that support Cryptovaranoides affinities with Anguimorpha but also for characters that contradict them. I am not sure if Whiteside et al. did such an exercise. Anyway, the test of congruence would not solve this issue because by adding only characters that support one hypothesis, we are biasing the results of such a test.

      To sum up, there is nothing wrong with proposing some hypotheses about character homology between different taxa that can be tested in future papers that will include a test of congruence. Lack of such a test makes the whole argumentation weaker in Whiteside et al., but not unacceptable, as the manuscript might suggest. My advice is to step down from strong statements such as "methodological flaw" and "empirical problems" and replace them with "limitations", which I think better describes the situation.

    1. eLife Assessment

      This revised manuscript provides fundamental findings on how the mouse barrel cortex connects to the dorsolateral striatum, uncovering that inputs from discrete whisker cortical columns are convergent and SPN-specific, but topographically organized at the population level. The evidence supporting this claim is compelling, demonstrating that SPNs uniquely integrate sparse input from variable stretches across the barrel cortex. The study would be of interest to basal ganglia and sensory-motor integration researchers.

    2. Reviewer #1 (Public review):

      Summary:

      By applying the laser scanning photostimulation (LSPS) approach to a novel slice preparation, the authors aimed to study the degree of convergence and divergence of cortical inputs to individual striatal projection neurons (SPNs).

      Strengths:

      The experiments were well-designed and conducted, and data analysis was thorough. The manuscript was well written and related work in the literature was properly discussed. This work has the potential to advance our understanding of how sensory inputs are integrated into the striatal circuits.

    3. Reviewer #2 (Public review):

      Summary:

      How corticostriatal synaptic connectivity gives rise to SPN encoding of sensory information is an important and currently unanswered question. The authors utilize a clever slice preparation in combination with electrophysiology and glutamate uncaging to dissect the synaptic connectivity between barrel cortex and individual striatal SPNs. In addition to mapping connectivity across major anatomical axes and cortical layers, the authors provide data showing that SPNs uniquely integrate sparse input from variable stretches across barrel cortex.

      Strengths:

      The methodology shows impressive rigor and the data robustly support the authors' conclusions. Overall, the manuscript addresses its core question, provides valuable insights into corticostriatal architecture, and is a welcome addition to the field.

    4. Reviewer #3 (Public review):

      Summary:

      The authors explored how individual dorsolateral striatum (DLS) spiny projection neurons (SPNs) receive functional input from whisker-related cortical columns. The authors developed and validated a novel slice preparation and method to which they applied rigorous functional mapping and thorough analysis. They found that individual SPNs were driven by sparse, scattered cortical clusters. Interestingly, while the cortical input fields of nearby SPNs had some degree of overlap, connectivity per SPN was largely distinct. Despite sparse, heterogeneous connectivity, topographical organization was identified. The authors lastly compared direct (D1) vs. indirect (D2) pathway cells, concluding that overall connectivity patterns were the same, but D1 cells received stronger input from L6 and D2 cells from L2/3. The paper thoughtfully addresses the question of whether barrel cortex broadly or selectively innervates SPNs. Their results indicate selective input that is loosely topographic. Their work deepens the understanding of how whisker-related somatosensory signals can drive striatal neurons.

      Strengths:

      Overall, this is a carefully conducted study, and the major claims are well-supported. The use of a novel ex vivo slice prep that keeps relevant corticostriatal projections intact allows for careful mapping of the barrel cortex to dorsolateral striatum SPNs. Careful reporting of both columnar and layer position, as well as postsynaptic SPN type (D1 or D2), allows the authors to uncover novel details about how the dorsolateral striatum represents whisker-related sensory information.

      Weaknesses:

      Most technical weaknesses have now been addressed in the text.