    1. eLife Assessment

      This study presents a valuable theoretical exploration on the electrophysiological mechanisms of ionic currents via gap junctions in hippocampal CA1 pyramidal-cell models, and their potentially unique contribution to local field potentials (LFPs). The biophysical foundations of transmembrane electric dipoles, and the associated argument points, are generally compelling. Experimental constraints on gap junctions and strictly quantitative matching between chemical vs. junctional inputs have been hard to achieve. This computational investigation thus offers a specific way to enhance conceptual understanding and provides interesting testable predictions, which would be of great interest to experimental neurophysiologists who interpret relevant recordings.

    2. Reviewer #1 (Public review):

      This manuscript makes a significant contribution to the field by exploring the dichotomy between chemical synaptic and gap junctional contributions to extracellular potentials. While the study is comprehensive in its computational approach, adding experimental validation, network-level simulations, and expanded discussion on implications would elevate its impact further.

      Strengths:

      Novelty and Scope:

      The manuscript provides a detailed investigation into the contrasting extracellular field potential (EFP) signatures arising from chemical synapses and gap junctions, an underexplored area in neuroscience. It highlights the critical role of active dendritic processes in shaping EFPs, pushing forward our understanding of how electrical and chemical synapses contribute differently to extracellular signals.

      Methodological Rigor:

      The use of morphologically and biophysically realistic computational models for CA1 pyramidal neurons ensures that the findings are grounded in physiological relevance. Systematic analysis of various factors, including the presence of sodium, leak, and HCN channels, offers a clear dissection of how transmembrane currents shape EFPs.

      Biological Relevance:

      The findings emphasize the importance of incorporating gap junctional inputs in analyses of extracellular signals, which have traditionally focused on chemical synapses. The observed polarity differences and spectral characteristics provide novel insights into how neural computations may differ based on the mode of synaptic input.

      Clarity and Depth:

      The manuscript is well-structured, with logical progression from synchronous input analyses to asynchronous and rhythmic inputs, ensuring comprehensive coverage of the topic.

      Comments on revised version:

      The authors have addressed all my concerns in the revised version of the manuscript.

    3. Reviewer #2 (Public review):

      Summary:

      This computational work examines whether the inputs that neurons receive through electrical synapses (gap junctions) have different signatures in the extracellular local field potential (LFP) compared to inputs via chemical synapses. The authors present the results of a series of model simulations where either electric or chemical synapses targeting a single hippocampal pyramidal neuron are activated in various spatio-temporal patterns, and the resulting LFP in the vicinity of the cell is calculated and analyzed. The authors find several notable qualitative differences between the LFP patterns evoked by gap junctions vs. chemical synapses. For some of these findings, the authors demonstrate convincingly that the observed differences are explained by the electric vs. chemical nature of the input, and these results likely generalize to other cell types. However, in other cases, it remains plausible (or even likely) that the differences are caused, at least partly, by other factors (such as different intracellular voltage responses due to differences in the amplitudes and time courses of the input currents). Furthermore, it was not immediately clear to me how the results could be applied to analyze more realistic situations where neurons receive partially synchronized excitatory and inhibitory inputs via chemical and electric synapses.

      Strengths:

      The main strength of the paper is that it draws attention to the fact that inputs to a neuron via gap junctions are expected to give rise to a different extracellular electric field compared to inputs via chemical synapses, even if the intracellular effects of the two types of input are similar. This is because, unlike chemical synaptic inputs, inputs via gap junctions are not directly associated with transmembrane currents. This is a general result that holds independent of many details such as the cell types or neurotransmitters involved.

      Another strength of the article is that the authors attempt to provide intuitive, non-technical explanations of most of their findings, which should make the paper readable also for non-expert audiences (including experimentalists).

      Weaknesses:

      The most problematic aspect of the paper relates to the methodology for comparing the effects of electric vs. chemical synaptic inputs on the LFP. The authors seem to suggest that the primary cause of all the differences seen in the various simulation experiments is the different nature of the input, and particularly the difference between the transmembrane current evoked by chemical synapses and the gap junctional current that does not involve the extracellular space. However, this is clearly an oversimplification: since no real attempt is made to quantitatively match the two conditions that are compared (e.g., regarding the strength and temporal profile of the inputs), the differences seen can be due to factors other than the electric vs. chemical nature of synapses. In fact, if inputs were identical in all parameters other than the transmembrane vs. directly injected nature of the current, the intracellular voltage responses and, consequently, the currents through voltage-gated and leak channels would also be the same, and the LFPs would differ exactly by the contribution of the transmembrane current evoked by the chemical synapse. This is evidently not the case for any of the simulated comparisons presented, and the differences in the membrane potential response are rather striking in several cases (e.g., in the case of random inputs, there is only one action potential with gap junctions, but multiple action potentials with chemical synapses). Consequently, it remains unclear which observed differences are fundamental in the sense that they are directly related to the electric vs. chemical nature of the input, and which differences can be attributed to other factors such as differences in the strength and pattern of the inputs (and the resulting difference in the neuronal electric response).
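
The matched-input argument above can be illustrated with a minimal sketch. It assumes a point-source approximation of the extracellular potential in a homogeneous medium, a two-compartment cell, and made-up current values; none of these numbers, names, or geometries come from the manuscript. Outward transmembrane current is taken as positive (a source) and inward current as negative (a sink).

```python
import math

SIGMA = 0.3  # extracellular conductivity in S/m (assumed, commonly used value)

def point_source_potential(currents_nA, positions_um, electrode_um, sigma=SIGMA):
    """Extracellular potential in microvolts from point transmembrane currents.

    Uses phi = sum_k I_k / (4 * pi * sigma * r_k); outward current is positive.
    """
    phi_uV = 0.0
    for i_nA, pos in zip(currents_nA, positions_um):
        r_um = math.dist(pos, electrode_um)
        # nA / (S/m * um) gives mV; the factor 1e3 converts mV to uV.
        phi_uV += i_nA / (4.0 * math.pi * sigma * r_um) * 1e3
    return phi_uV

# Hypothetical geometry: dendritic and somatic compartments, electrode near the dendrite.
dendrite, soma, electrode = (0.0, 100.0, 0.0), (0.0, 0.0, 0.0), (50.0, 100.0, 0.0)

# Membrane (leak, voltage-gated, capacitive) currents, assumed identical in both
# conditions because the intracellular voltage response is assumed to be matched.
membrane_nA = [0.4, 0.6]  # outward at dendrite and soma (illustrative values)
synaptic_nA = -1.0        # inward synaptic current at the dendrite (chemical case only)

# Gap junctional input: the injected current never crosses the membrane at the
# input site, so only the (outward) membrane currents act as extracellular sources.
phi_gap = point_source_potential(membrane_nA, [dendrite, soma], electrode)

# Chemical input: the same membrane currents plus the synaptic sink.
phi_chem = point_source_potential(membrane_nA + [synaptic_nA],
                                  [dendrite, soma, dendrite], electrode)

# Contribution of the synaptic transmembrane current alone.
phi_syn = point_source_potential([synaptic_nA], [dendrite], electrode)
```

Under these assumptions `phi_chem - phi_gap` equals `phi_syn`, which is the reviewer's point: any further difference between the two conditions has to come from unmatched membrane responses rather than from the electric vs. chemical nature of the input.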

      Some of the explanations offered for the effects of cellular manipulations on the LFP appear to be incomplete. More specifically, the authors observed that blocking leak channels significantly changed the shape of the LFP response to synchronous synaptic inputs - but only when electric inputs were used, and when sodium channels were intact. The authors seemed to attribute this phenomenon to a direct effect of leak currents on the extracellular potential - however, this appears unlikely both because it does not explain why blocking the leak conductance had no effect in the other cases, and because the leak current is several orders of magnitude smaller than the spike-generating currents that make the largest contributions to the LFP. An indirect effect mediated by interactions of the leak current with some voltage-gated currents appears to be the most likely explanation, but identifying the exact mechanism would require further simulation experiments and/or a detailed analysis of intracellular currents and the membrane potential in time and space.

      In every simulation experiment in this study, inputs through electric synapses are modeled as intracellular current injections of pre-determined amplitude and time course based on the sampled dendritic voltage of potential synaptic partners. This is a major simplification that may have a significant impact on the results. First, the current through gap junctions depends on the voltage difference between the two connected cellular compartments and is thus sensitive to the membrane potential of the cell that is treated as the neuron "receiving" the input in this study (although, strictly speaking, there is no pre- or postsynaptic neuron in interactions mediated by gap junctions). This dependence on the membrane potential of the target neuron is completely missing here. A related second point is that gap junctions also change the apparent membrane resistance of the neurons they connect, effectively acting as additional shunting (or leak) conductance in the relevant compartments. This effect is completely missed by treating gap junctions as pure current sources.
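
Both points can be made concrete with a minimal sketch. It assumes an ohmic gap junction and a single passive compartment at steady state; the function names and all parameter values are hypothetical and chosen only for illustration, not taken from the model under review.

```python
def gap_junction_current(g_gj_nS, v_partner_mV, v_target_mV):
    """Ohmic gap junction current into the target compartment, in pA (nS * mV)."""
    return g_gj_nS * (v_partner_mV - v_target_mV)

def steady_state_voltage(g_leak_nS, e_leak_mV, i_inj_pA,
                         g_gj_nS=0.0, v_partner_mV=0.0):
    """Steady-state voltage of one passive compartment, in mV.

    Solves g_leak * (V - E_leak) = I_inj + g_gj * (V_partner - V) for V.
    """
    total_g_nS = g_leak_nS + g_gj_nS
    return (g_leak_nS * e_leak_mV + g_gj_nS * v_partner_mV + i_inj_pA) / total_g_nS

# Illustrative values (assumptions, not fitted to any data).
G_LEAK, E_LEAK, G_GJ = 10.0, -65.0, 2.0

# Point 1: the junctional current depends on the target's own membrane potential.
i_target_at_rest = gap_junction_current(G_GJ, -40.0, -65.0)
i_target_depolarized = gap_junction_current(G_GJ, -40.0, -50.0)

# Point 2: the junction adds a shunt, lowering the response to the same test current.
dv_isolated = steady_state_voltage(G_LEAK, E_LEAK, 100.0) - E_LEAK
dv_coupled = steady_state_voltage(G_LEAK, E_LEAK, 100.0, G_GJ, E_LEAK) - E_LEAK
```

A fixed current source reproduces neither effect: it delivers the same current regardless of the target voltage and leaves the input conductance at `G_LEAK` rather than `G_LEAK + G_GJ`.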

      One prominent claim of the article that is emphasized even in the abstract is that HCN channels mediate an outward current in certain cases. Although this statement is technically correct, there are two reasons why I do not consider this a major finding of the paper. First, as the authors acknowledge, this is a trivial consequence of the relatively slow kinetics of HCN channels: when at least some of the channels are open, any input that is sufficiently fast and strong to take the membrane potential across the reversal potential of the channel will lead to the reversal of the polarity of the current. This effect is quite generic and well-known, and is by no means specific to gap junctional inputs or even HCN channels. Second, and perhaps more importantly, the functional consequence of this reversed current through HCN channels is likely to be negligible. As clearly shown in Supplementary Figure S4, the HCN current becomes outward only for an extremely short time period during the action potential, which is also a period when several other currents are also active and likely dominant due to their much higher conductances. I also note that several of these relevant facts remain hidden in Figure 3, both because of its focus on peak values, and because of the radically different units on the vertical axes of the current plots.
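
The generic nature of this polarity reversal can be seen in a minimal sketch. It assumes a single HCN-like conductance with first-order gating, a square voltage pulse standing in for an action potential, and illustrative parameter values (reversal potential, half-activation voltage, slope, time constant) that are not taken from the manuscript. Outward current is taken as positive.

```python
import math

E_H = -30.0     # mV, assumed HCN reversal potential
G_H = 1.0       # nS, assumed maximal conductance
V_HALF = -82.0  # mV, assumed half-activation voltage
SLOPE = 8.0     # mV, assumed slope factor
TAU_H = 50.0    # ms, assumed gating time constant (slow relative to a spike)

def m_inf(v_mV):
    """Steady-state activation; the channel opens with hyperpolarization."""
    return 1.0 / (1.0 + math.exp((v_mV - V_HALF) / SLOPE))

def hcn_current_trace(voltage_mV, dt_ms):
    """HCN current in pA for a voltage trace (forward Euler on the gate)."""
    m = m_inf(voltage_mV[0])  # start from steady state at the initial voltage
    currents = []
    for v in voltage_mV:
        currents.append(G_H * m * (v - E_H))  # negative = inward, positive = outward
        m += dt_ms * (m_inf(v) - m) / TAU_H
    return currents

DT = 0.025  # ms
# 5 ms at rest, a 1 ms pulse to +20 mV, then 5 ms at rest.
voltage = [-65.0] * 200 + [20.0] * 40 + [-65.0] * 200
i_h = hcn_current_trace(voltage, DT)

# Total time during which the current is outward.
outward_ms = sum(1 for i in i_h if i > 0) * DT
```

Because the gate barely closes during the 1 ms pulse, the current follows the sign of `v - E_H`: inward at rest and outward only while the voltage exceeds the reversal potential, so `outward_ms` matches the pulse duration in this sketch.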

      Finally, I missed an appropriate validation of the neuronal model used, and also the characterization of the effects of the in silico manipulations used on the basic behavior of the model. As far as I understand, the model in its current form has not been used in other studies, although it is closely related to models used in earlier modeling work from the same laboratory. If this is the case, it would be important to demonstrate convincingly through (preferably quantitative) comparisons with experimental data using different protocols that the model captures the physiological behavior of at least the relevant compartments (in this case, the dendrites and the soma) of hippocampal pyramidal neurons sufficiently well that the results of the modeling study are relevant to the real biological system. In addition, the correct interpretation of various manipulations of the model would be strongly facilitated by investigating and discussing how the physiological properties of the model neuron are affected by these alterations.

      Comments on revised version:

      The authors made mainly cosmetic changes in the manuscript (primarily by adding more discussion), and most of these do not affect my earlier assessment. I have updated my Public Review in a few places to reflect those few changes that substantially address my previous concerns.

    4. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment:

      This study presents a valuable theoretical exploration on the electrophysiological mechanisms of ionic currents via gap junctions in hippocampal CA1 pyramidal-cell models, and their potential contribution to local field potentials (LFPs) that is different from the contribution of chemical synapses. The biophysical argument regarding electric dipoles appears solid, but the evidence can be more convincing if their predictions are tested against experiments. A shortage of model validation and strictly comparable parameters used in the comparisons between chemical vs. junctional inputs makes the modeling approach incomplete; once strengthened, the finding can be of broad interest to electrophysiologists, who often make recordings from regions of neurons interconnected with gap junctions.

      We gratefully thank the editors and the reviewers for the time and effort in rigorously assessing our manuscript, for the constructive review process, for their enthusiastic responses to our study, and for the encouraging and thoughtful comments. We especially thank you for deeming our study to be a valuable exploration on the differential contributions of active dendritic gap junctions vs. chemical synapses to local field potentials. We thank you for your appreciation of the quantitative biophysical demonstration on the differences in electric dipoles that appear in extracellular potentials with gap junctions vs. chemical synapses.

      However, we are surprised by aspects of the assessment that resulted in deeming the approach incomplete, especially given the following with specific reference to the points raised:

      (1) Testing against experiments: With specific reference to gap junctions, quantitative experimental verification becomes extremely difficult because of the well-established non-specificities associated with gap junctional modulators (Behrens et al., 2011; Rouach et al., 2003). In addition, genetic knockouts of gap junctional proteins are either lethal or involve functional compensation (Bedner et al., 2012; Lo, 1999), together making causal links to specific gap junctional contributions with currently available techniques infeasible.

      In addition, the complex interactions between co-existing chemical synaptic, gap junctional, and active dendritic contributions from several cell-types make the delineation of the contributions of specific components infeasible with experimental approaches. A computational approach is the only quantitative route to specifically delineate the contributions of individual components to extracellular potentials, as seen from studies that have addressed the question of active dendritic contributions to field potentials (Halnes et al., 2024; Ness et al., 2018; Reimann et al., 2013; Sinha & Narayanan, 2015, 2022) or spiking contributions to local field potentials (Buzsaki et al., 2012; Gold et al., 2006; Schomburg et al., 2012). The biophysically and morphologically realistic computational modeling route is therefore invaluable in assessing the impact of individual components to extracellular field potentials (Einevoll et al., 2019; Halnes et al., 2024).

      Together, we emphasize that the computational modeling route is currently the only quantitative methodology to delineate the contributions of gap junctions vs. chemical synapses to extracellular potentials.

      (2) Model validation: The model used in this study was adopted from a physiologically validated model from our laboratory (Roy & Narayanan, 2021). Please note that the original model was validated against several physiological measurements along the somatodendritic axis. We sincerely regret our oversight in not mentioning clearly that we have used an existing, thoroughly physiologically-validated model from our laboratory in this study.

      (3) Comparisons between chemical vs. junctional inputs: We had taken elaborate precautions in our experimental design to match the intracellular electrophysiological signatures with reference to synchronous as well as oscillatory inputs, irrespective of whether inputs arrived through gap junctions or chemical synapses. A new Supplementary Figure S3 has been added to address this concern raised by the reviewers.

      In the revised manuscript, we have addressed all the concerns raised by the reviewers in detail. We have provided point-by-point responses to reviewers’ helpful and constructive comments below. We thank the editors and the reviewers for this constructive review process, which helped us in improving our manuscript with specific reference to emphasizing the novelty of our approach and conclusions. The specific changes incorporated into the revised manuscript are detailed below.

      Reviewer #1 (Public review):

      This manuscript makes a significant contribution to the field by exploring the dichotomy between chemical synaptic and gap junctional contributions to extracellular potentials. While the study is comprehensive in its computational approach, adding experimental validation, network-level simulations, and expanded discussion on implications would elevate its impact further.

      We gratefully thank you for your time and effort in rigorously assessing our manuscript, for the enthusiastic response, and the encouraging and thoughtful comments on our study. In what follows, we have provided point-by-point responses to the specific comments.

      Strengths

      Novelty and Scope

      The manuscript provides a detailed investigation into the contrasting extracellular field potential (EFP) signatures arising from chemical synapses and gap junctions, an underexplored area in neuroscience. It highlights the critical role of active dendritic processes in shaping EFPs, pushing forward our understanding of how electrical and chemical synapses contribute differently to extracellular signals.

      We thank you for the positive comments on the novelty of our approach and how our study addresses an underexplored area in neuroscience. The assumptions about the passive nature of dendritic structures had indeed resulted in an underestimation of the contributions of gap junctions to extracellular potentials. Once the realities of active structures are accounted for, the contributions of gap junctions increase by several orders of magnitude compared to passive structures (Fig. 1D).

      Methodological Rigor

      The use of morphologically and biophysically realistic computational models for CA1 pyramidal neurons ensures that the findings are grounded in physiological relevance. Systematic analysis of various factors, including the presence of sodium, leak, and HCN channels, offers a clear dissection of how transmembrane currents shape EFPs.

      We thank you for your encouraging comments on the experimental design and methodological rigor of our approach.

      Biological Relevance

      The findings emphasize the importance of incorporating gap junctional inputs in analyses of extracellular signals, which have traditionally focused on chemical synapses. The observed polarity differences and spectral characteristics provide novel insights into how neural computations may differ based on the mode of synaptic input.

      We thank you for your positive comments on the biological relevance of our approach. We also gratefully thank you for emphasizing the two striking novelties unveiling the dichotomy between gap junctions and chemical synapses in their contributions to field potentials: polarity differences and spectral characteristics.

      Clarity and Depth

      The manuscript is well-structured, with a logical progression from synchronous input analyses to asynchronous and rhythmic inputs, ensuring comprehensive coverage of the topic.

      We sincerely thank you for the positive comments on the structure and comprehensive coverage of our manuscript encompassing different types of inputs that neurons typically receive.

      Weaknesses and Areas for Improvement

      Generality and Validation

      The study focuses exclusively on CA1 pyramidal neurons. Expanding the analysis to other cell types, such as interneurons or glial cells, would enhance the generalizability of the findings. Experimental validation of the computational predictions is entirely absent. Empirical data correlating the modeled EFPs with actual recordings would strengthen the claims.

      We thank you for raising this important point. The prime novelty and the principal conclusion of this study is that gap junctional contributions to extracellular field potentials are orders of magnitude higher when the active nature of cellular compartments is accounted for. The lacuna in the literature has been consequent to the assumption that cellular compartments are passive, resulting in the dogma that gap junctional contributions to field potentials are negligible. Despite knowledge about active dendritic structures for decades now, this assumption has kept studies from understanding or even exploring the contributions of gap junctions to field potentials. The rationale behind the choice of a computational approach to address the lacuna was as follows:

      (1) The complex interactions between co-existing chemical synaptic, gap junctional, and active dendritic contributions from several cell-types make the delineation of the contributions of specific components infeasible with experimental approaches. A computational approach is the only quantitative route to specifically delineate the contributions of individual components to extracellular potentials, as seen from studies that have addressed the question of active dendritic contributions to field potentials (Halnes et al., 2024; Ness et al., 2018; Reimann et al., 2013; Sinha & Narayanan, 2015, 2022) or spiking contributions to local field potentials (Buzsaki et al., 2012; Gold et al., 2006; Schomburg et al., 2012). The biophysically and morphologically realistic computational modeling route is therefore invaluable in assessing the impact of individual components to extracellular field potentials (Einevoll et al., 2019; Halnes et al., 2024).

      (2) With specific reference to gap junctions, quantitative experimental verification becomes extremely difficult because of the well-established non-specificities associated with gap junctional modulators (Behrens et al., 2011; Rouach et al., 2003). The non-specific actions of gap junctions are tabulated in Table 2 of Szarka et al. (2021). In addition, genetic knockouts of gap junctional proteins are either lethal or involve functional compensation (Bedner et al., 2012; Lo, 1999), together making causal links to specific gap junctional contributions with currently available techniques infeasible.

      We highlight the novelty of our approach and of the conclusions about differences in extracellular signatures associated with active-dendritic chemical synapses and gap junctions, against these experimental difficulties. We emphasize that the computational modeling route is currently the only quantitative methodology to delineate the contributions of gap junctions vs. chemical synapses to extracellular potentials. Our analyses clearly demonstrate that gap junctions do contribute to extracellular potentials if the active nature of the cellular compartments is explicitly accounted for (Fig. 1D). We also show theoretically well-grounded and mechanistically elucidated differences in polarity (Figs. 1–3) as well as in spectral signatures (Figs. 5–8) of extracellular potentials associated with gap junctional vs. chemical synaptic inputs. Together, our fundamental demonstration in this study is the critical need to account for the active nature of cellular compartments in studying gap junctional contributions to extracellular potentials, with CA1 pyramidal neuronal dendrites used as an exemplar.

      In the revised version of the manuscript, we have emphasized the motivations for the approach we took, highlighting the specific novelties both in methodological and conceptual aspects, finally emphasizing the need to account for other cell types and gap junctional contributions therein. Importantly, we have emphasized the non-specificities associated with gap-junctional blockers as the reason why experimental delineation of gap junctional vs. chemical synaptic contributions to LFP becomes tedious. We believe that these points underscore the need for the computational approach that we took to address this important question, apart from the novelties of the study.

      In response to your constructive comments, we have added the following to the revised version of the manuscript, in the Introduction section as motivation for the specific route we took:

      “Given the complexity arising from the concurrent activity of chemical synapses, gap junctions, and active dendritic conductances across multiple neuronal populations, experimentally isolating the contributions of individual components to extracellular potentials remains highly challenging. To address this limitation, we employed a computational modeling approach, which provides a quantitative framework for systematically dissecting the distinct roles of specific cellular and synaptic elements. This strategy is consistent with previous studies that have successfully used computational methods to elucidate the contributions of active dendritic mechanisms to LFPs (Halnes et al., 2024; Ness et al., 2018; Reimann et al., 2013; Sinha & Narayanan, 2015, 2022) or spiking contributions to LFPs (Buzsaki et al., 2012; Gold et al., 2006; Schomburg et al., 2012). In addition, experimentally isolating the contribution of gap junctions is complicated by non-specific effects of available pharmacological modulators targeting these connections (Behrens et al., 2011; Rouach et al., 2003). Most genetic knockouts of gap junctional proteins are either lethal or trigger functional compensatory mechanisms (Bedner et al., 2012; Lo, 1999), thereby rendering causal attribution of specific gap junctional contributions infeasible with currently available experimental approaches. Consequently, biophysically and morphologically detailed computational modeling provides a crucial means to evaluate the impact of individual neuronal components on extracellular field potentials (Einevoll et al., 2019; Halnes et al., 2024).”

      We thank you for raising this point as this allowed us to expand on the specific motivations for the approach we took, and to present the specific novelties of our study to the analyses of extracellular field potentials. Thank you.

      Role of Active Dendritic Currents

      The paper emphasizes active dendritic currents, particularly the role of HCN channels in generating outward currents under certain conditions. However, further discussion of how this mechanism integrates into broader network dynamics is warranted.

      We thank you for this constructive suggestion. We agree that it is important to consider the implications for broader network dynamics of the outward HCN currents that are observed with synchronous inputs. In the revised manuscript, we have elaborated in detail on the implications of the outward HCN current for network dynamics. The following paragraphs have been added to the Discussion subsection on “Outward HCN currents regulate extracellular potentials”:

      “HCN channels play a critical role in shaping hippocampal network dynamics by modulating neuronal excitability, oscillatory behavior, and susceptibility to pathological states (Kessi et al., 2022; Magee, 1998; Mishra & Narayanan, 2025; Nolan et al., 2004). The outward-like properties of the HCN current we observed may have specific functional implications at different scales. At the cellular scale, the manifestation of outward current during action potentials or plateau potentials could contribute to afterhyperpolarization, thereby regulating firing properties. In cortical and hippocampal pyramidal neurons, most single-neuron processing occurs in their elaborate dendritic branches, where there is spatiotemporal summation of different synaptic potentials, plateau potentials, backpropagating action potentials, and dendritic spikes (Johnston & Narayanan, 2008; Major et al., 2013; Stuart & Spruston, 2015). Considering the heavy expression of HCN channels in the dendrites of hippocampal and cortical pyramidal neurons (Kole et al., 2006; Lorincz et al., 2002; Magee, 1998; Williams & Stuart, 2000), the backpropagating action potentials, plateau potentials, or dendritic spikes at dendritic locations could yield outward currents. These outward currents could act as a hyperpolarizing mechanism that suppresses spatiotemporal summation of the different dendritic potentials.

      At the network scale, such regulation of dendritic potentials and somatic firing could contribute to an overall reduction in firing rates of different neurons in the network. For instance, as inhibitory neurons typically elicit action potentials at higher frequencies, somatic outward HCN currents would occur more frequently in inhibitory neurons that express HCN channels compared to excitatory neurons. However, the heavy expression of HCN channels in the dendrites and the higher prevalence of dendritic spikes and plateau potentials in dendrites (Basak & Narayanan, 2018; Larkum et al., 2022; Moore et al., 2017) imply that the impact on outward HCN currents might be higher. Thus, the presence of outward HCN currents would regulate network balance of excitation-inhibition in an activity-dependent manner. Additionally, the outward component of the current through HCN channels could contribute to stabilization of network synchrony by promoting spike phase coherence and to modulation of spike-LFP phase relationships (Das et al., 2017; Ness et al., 2016, 2018; Seenivasan & Narayanan, 2020; Sinha & Narayanan, 2015, 2022).

      Together, the outward HCN current could play critical roles in regulating several cellular and network functions including spatiotemporal summation within single neurons, amplitude and phase of different oscillations, excitatory-inhibitory interactions, and rate and temporal coding involved in spatial navigation (Hussaini et al., 2011; Nolan et al., 2004; O'Keefe & Recce, 1993). In the context of brain rhythms, future investigations are needed to explore ripple-frequency oscillations, specifically to assess whether high-frequency network interactions are modulated by HCN outward currents. Importantly, future studies could specifically focus on delineating the prevalence and specific contributions of outward currents through HCN channels to single-neuron and network physiology.”

      We thank you for highlighting this point, as it allowed us to elaborate the broader roles of HCN channels to single-cell computation, network dynamics, and field potentials. Thank you.

      Analysis of Plasticity

      While the manuscript mentions plasticity in the discussion, there are no simulations that account for activity-dependent changes in synaptic or gap junctional properties. Including such analyses could significantly enhance the relevance of the findings.

      We thank you for this constructive suggestion. Please note that we have presented consistent results for both fewer and more gap junctions in our analyses (Figure 1 with 217 gap junctions and Supplementary Figure 1 with 99 gap junctions). Our fundamentally novel result that gap junctions onto active dendrites differentially shape LFPs therefore holds true irrespective of the relative density of gap junctions onto the neuron. These results demonstrate that the conclusions about gap junctional contributions to LFPs are invariant to plasticity in the number of gap junctions.

      We had only briefly mentioned plasticity in the Introduction to highlight the different modes of synaptic transmission and to emphasize that plasticity has been studied in both chemical synapses and gap junctions, playing a role in learning and adaptation. However, it seems that this wording inadvertently suggested that our study includes plasticity simulations. Therefore, we have removed that sentence from the Introduction in the revised manuscript to ensure clarity.

      In the ‘Limitations of analyses and future studies’ section in Discussion, we suggested investigating the impact of plasticity mechanisms—specifically, activity-dependent plasticity of ion channels—on synaptic receptors vs. gap junctions and their effects on extracellular field potentials under various input conditions and plasticity combinations across different structures. We fully agree with the reviewer that such studies would offer valuable insights and further enhance the broader relevance of our findings. However, while our study implies this direction, it was not the primary focus of our investigation.

      In the revised manuscript, we have also expanded on intrinsic/synaptic plasticity and how they could contribute to LFPs (Sinha & Narayanan, 2015, 2022), while also pointing to simulations with different numbers of gap junctions in this context. The following specific changes have been incorporated into the revised manuscript:

      Discussion subsection “Limitations of analyses and future directions”

      “We demonstrated that the contribution of gap junctions to extracellular field potentials remains consistent regardless of the number of gap junctions. Specifically, we showed that the distinct positive LFP deflections persisted irrespective of their relative density on neurons (Fig. 1 with 217 gap junctions and Supplementary Fig. 1 with 99 gap junctions). Previous studies have quantitatively demonstrated that intrinsic and synaptic plasticity modulate hippocampal LFPs and phase coding (Sinha & Narayanan, 2015, 2022). Future analyses should also assess the impact of activity-dependent plasticity in ion channels (on dendrites, axonal initial segments, and other compartments), in synaptic receptors, and in gap junctions (Andersen et al., 2006; Coulon & Landisman, 2017; Johnston & Narayanan, 2008; Magee & Grienberger, 2020; Mishra & Narayanan, 2021; Neves et al., 2008; O'Brien, 2014; Pereda, 2014; Vaughn & Haas, 2022) on extracellular potentials with various kinds of gap junctional inputs and different combinations of plasticity in various structures. Interactions among different forms of plasticity and how co-dependent plasticity in different components alters extracellular field potentials could provide deeper insights about physiological changes during learning and pathological changes observed in different neurological disorders (Sinha & Narayanan, 2022).”

      We thank you for highlighting this, as it allowed us to sharpen the specific focus of the manuscript and the study. Thank you.

      Frequency-Dependent Effects

      The study demonstrates that gap junctional inputs suppress high-frequency EFP power due to membrane filtering. However, it could delve deeper into the implications of this for different brain rhythms, such as gamma or ripple oscillations.
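      One way to picture the membrane filtering mentioned in this comment is a single-pole RC low-pass filter. The sketch below is only an illustration under that simplifying assumption; the function name `rc_gain` and the 20 ms time constant are hypothetical choices, not values from the manuscript, and the morphologically realistic model in the study is far richer than a single passive compartment.

```python
import math

def rc_gain(freq_hz, tau_s=0.02):
    """Amplitude gain of a single-pole RC low-pass filter at freq_hz.

    tau_s is an assumed membrane time constant (20 ms), for illustration only.
    """
    return 1.0 / math.sqrt(1.0 + (2.0 * math.pi * freq_hz * tau_s) ** 2)

# Compare attenuation at representative theta, gamma, and ripple frequencies.
for label, f in [("theta", 8.0), ("gamma", 40.0), ("ripple", 150.0)]:
    print(f"{label:>6} ({f:5.1f} Hz): gain = {rc_gain(f):.3f}")
```

      In this toy picture, the gain falls monotonically with frequency, so ripple-band components of an intracellularly injected input are attenuated more strongly than gamma-band or theta-band components.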

      We sincerely thank you for these insightful comments, with which we fully agree. As it so happens, this manuscript forms the first part of a broader study where we explore the implications of gap junctions for ripple-frequency oscillations. The ripple oscillations part of the work was presented as a poster at the Society for Neuroscience (SfN) annual meeting 2024 (Sirmaur & Narayanan, 2024). There, we simulate a neuropil made of hundreds of morphologically realistic neurons to assess the role of different synaptic inputs (excitatory, inhibitory, and gap junctional) and active dendrites in ripple-frequency oscillations. We demonstrate there that the conclusions from single-neuron simulations in this current manuscript extend to a neuropil with several neurons, each receiving excitatory, inhibitory, and gap-junctional inputs, especially with reference to high-frequency oscillations. Our network-based analyses unveiled a dominant mediatory role of patterned inhibition in ripple generation, with recurrent excitations through chemical synapses and gap junctions, in conjunction with return-current contributions from active dendrites, playing regulatory roles in determining ripple characteristics (Sirmaur & Narayanan, 2024).

      Our principal goal in this study, therefore, was to lay the single-neuron foundation for network analyses of the impact of gap junctions on LFPs. We are preparing the network part of the study, with a strong focus on ripple-frequency oscillations, for separate submission for peer review. Please see the abstract of our poster on this topic, presented at the Society for Neuroscience annual meeting 2024, here: https://tinyurl.com/57ehvsep.

      In the revised manuscript, we have mentioned the results from our SfN abstract with reference to network simulations and high-frequency oscillations, while also presenting discussions from other studies on the role of gap junctions in synchrony and LFP oscillations. The following has been added to the revised manuscript under the Discussion subsection “High-frequency LFP power was suppressed with gap junctional inputs”:

      “In this context, our analyses lay the foundation for network analyses of the impact of gap junctions on LFPs. The conclusions from the single-neuron simulations in this study extend to a neuropil with several neurons, each receiving synaptic and gap junctional inputs, especially with reference to high-frequency ripple oscillations (Sirmaur & Narayanan, 2024). A neuropil made of hundreds of morphologically realistic pyramidal neurons was used to assess the role of different synaptic inputs (excitatory, inhibitory, and gap junctional) with different patterns of stimulation and active dendritic contributions to ripple-frequency oscillations. Network-based analyses have unveiled a dominant mediatory role of patterned inhibition in ripple generation, with recurrent excitations through chemical synapses and gap junctions, in conjunction with return-current contributions from active dendrites, playing modulatory roles in governing ripple characteristics (Sirmaur & Narayanan, 2024). Future studies could expand on these conclusions to explore the implications of frequency-dependent filtering (with reference to gap junctional coupling) on high-frequency extracellular oscillations.”

      We thank you for highlighting this point, as it allowed us to expand on the implications of our analyses for brain rhythms, especially with reference to high-frequency oscillations. Thank you.

      Visualization

      Figures are dense and could benefit from more intuitive labeling and focused presentations. For example, isolating key differences between chemical and gap junctional inputs in distinct panels would improve clarity.

      We thank you for this constructive suggestion. Throughout the manuscript, we place the outcomes associated with chemical synapses and gap junctions in the same figure, adjacent to each other. We believe that this offers a visually intuitive distinction between the outcomes for chemical synapses and gap junctions. Splitting them would place the outcomes in different figures and require turning pages or placing two different figures adjacent to each other for quantitative comparison. We respectfully request that we be allowed to retain this form of visualization in the figures. Thank you.

      Contextual Relevance

      The manuscript touches on how these findings relate to known physiological roles of gap junctions (e.g., in gamma rhythms) but does not explore this in depth. Stronger integration of the results into known neural network dynamics would enhance its impact.

      We sincerely appreciate your valuable suggestion and acknowledge the importance of integrating our results into established neural network dynamics, particularly their implications for gamma rhythms. We have addressed this aspect in the revised version of our manuscript. We have added this to the Discussion subsection on “High-frequency LFP power was suppressed with gap junctional inputs” of the revised manuscript:

      “In the context of oscillations and gap-junctional coupling, electrical synapses have been shown to regulate the emergence and stability of the network interactions underlying rhythms of different frequencies, especially gamma-frequency oscillations (Bocian et al., 2009; Buhl et al., 2003; Draguhn et al., 1998; Hormuzdi et al., 2001; Konopacki et al., 2004; LeBeau et al., 2003; Posluszny, 2014; Traub et al., 2003). Specifically, both genetic and pharmacological manipulations of gap junctions have been shown to disrupt gamma rhythms. Genetic deletion of connexin-36 impairs the gamma oscillations associated with awake, active behavioral states (Buhl et al., 2003; Hormuzdi et al., 2001). High-frequency oscillations in the hippocampus have been shown to be sensitive to pharmacological agents like carbenoxolone and octanol that are known to inhibit gap junctions. Carbenoxolone has been known to reduce the transient gamma-frequency oscillations while octanol abolishes the persistent gamma rhythm (Draguhn et al., 1998; Hormuzdi et al., 2001; Posluszny, 2014; Traub et al., 2003). In the context of our results, where we demonstrate that the relative contributions of gap-junctional coupling to high-frequency extracellular potentials are low (Figs. 6–7), how do gap junctions contribute to enhanced extracellular gamma oscillations in these circuits?

      It should be noted that in hippocampal circuits, gamma oscillations emerge predominantly due to interactions between inhibitory interneurons through GABA<sub>A</sub> receptors (Buzsaki & Wang, 2012; Colgin, 2016; Colgin & Moser, 2010; Wang, 2010; Wang & Buzsaki, 1996; Whittington et al., 1995). Thus, the presence of additional gap junctional coupling between these inhibitory neurons allows for tighter synchrony between these reciprocally inhibition-coupled neurons. In other words, the presence of gap junctions increases the probability of action potential generation in other neurons that are electrically coupled to them, together increasing the population of inhibitory neurons that elicit synchronous action potentials. When these synchronous action potentials act on the adjacent cells, both excitatory and inhibitory, the transmembrane GABA<sub>A</sub> receptor currents yield stronger gamma-frequency oscillations in the extracellular potentials (Draguhn et al., 1998; Hormuzdi et al., 2001; Posluszny, 2014; Traub et al., 2003). Thus, the stronger high-frequency oscillations observed in these scenarios are owing to the enhanced synchrony that is brought about by the gap-junctional coupling, which translates to stronger transmembrane inhibitory receptor currents.

      These observations also strongly emphasize the utility of the computational approach we took in this study towards discerning the specific roles of gap junctions. Gap junctional coupling has strong physiological roles in terms of enhancing synchronous activity across the neurons that it couples, and gap junctions are often expressed along with other receptors that connect the sets of neurons. Thus, the specific contributions of different neuronal components need to be studied with reference to how they contribute to physiological characteristics vs. their contributions to extracellular potentials. Computational modeling therefore offers an ideal route to understand the specific contributions of different neural-circuit components to extracellular field potentials and rhythms therein (Buzsaki et al., 2012; Einevoll et al., 2019; Einevoll et al., 2013; Sinha & Narayanan, 2022).”

      We thank you for highlighting this point, as it allowed us to delineate the impact of gap junctions on regulating synchrony across connected neurons vs. modulating field potentials. Thank you.

      Reviewer #2 (Public review):

      This computational work examines whether the inputs that neurons receive through electrical synapses (gap junctions) have different signatures in the extracellular local field potential (LFP) compared to inputs via chemical synapses. The authors present the results of a series of model simulations where either electric or chemical synapses targeting a single hippocampal pyramidal neuron are activated in various spatio-temporal patterns, and the resulting LFP in the vicinity of the cell is calculated and analyzed. The authors find several notable qualitative differences between the LFP patterns evoked by gap junctions vs. chemical synapses. For some of these findings, the authors demonstrate convincingly that the observed differences are explained by the electric vs. chemical nature of the input, and these results likely generalize to other cell types. However, in other cases, it remains plausible (or even likely) that the differences are caused, at least partly, by other factors (such as different intracellular voltage responses due to, e.g., the unequal strengths of the inputs). Furthermore, it was not immediately clear to me how the results could be applied to analyze more realistic situations where neurons receive partially synchronized excitatory and inhibitory inputs via chemical and electric synapses.

      We gratefully thank you for your time and effort in rigorously assessing our manuscript, for the enthusiastic response, and the encouraging and thoughtful comments on our study. In what follows, we have provided point-by-point responses to the specific comments.

      Strengths

      The main strength of the paper is that it draws attention to the fact that inputs to a neuron via gap junctions are expected to give rise to a different extracellular electric field compared to inputs via chemical synapses, even if the intracellular effects of the two types of input are similar. This is because, unlike chemical synaptic inputs, inputs via gap junctions are not directly associated with transmembrane currents. This is a general result that holds independent of many details such as the cell types or neurotransmitters involved.

      We gratefully thank you for the positive comments and the encouraging words about the novel contributions of our study. We are particularly thankful to you for your comment on the generality of our conclusions that hold for different cell types and neurotransmitters involved.

      Another strength of the article is that the authors attempt to provide intuitive, non-technical explanations of most of their findings, which should make the paper readable also for non-expert audiences (including experimentalists).

      We sincerely thank you for the positive comments about the readability of the paper.

      Weaknesses

      The most problematic aspect of the paper relates to the methodology for comparing the effects of electric vs. chemical synaptic inputs on the LFP. The authors seem to suggest that the primary cause of all the differences seen in the various simulation experiments is the different nature of the input, and particularly the difference between the transmembrane current evoked by chemical synapses and the gap junctional current that does not involve the extracellular space. However, this is clearly an oversimplification: since no real attempt is made to quantitatively match the two conditions that are compared (e.g., regarding the strength and temporal profile of the inputs), the differences seen can be due to factors other than the electric vs. chemical nature of synapses. In fact, if inputs were identical in all parameters other than the transmembrane vs. directly injected nature of the current, the intracellular voltage responses and, consequently, the currents through voltage-gated and leak currents would also be the same, and the LFPs would differ exactly by the contribution of the transmembrane current evoked by the chemical synapse. This is evidently not the case for any of the simulated comparisons presented, and the differences in the membrane potential response are rather striking in several cases (e.g., in the case of random inputs, there is only one action potential with gap junctions, but multiple action potentials with chemical synapses). Consequently, it remains unclear which observed differences are fundamental in the sense that they are directly related to the electric vs. chemical nature of the input, and which differences can be attributed to other factors such as differences in the strength and pattern of the inputs (and the resulting difference in the neuronal electric response).

      We thank you for raising this important point. We would like to emphasize that our experimental design and analyses quantitatively account for the spatial distribution and temporal pattern of specific kinds of inputs that arrive through gap junctions and chemical synapses. We submit that our analyses quantitatively demonstrate that the fundamental difference between the gap junctional and chemical synaptic contributions to extracellular potentials is the absence of the direct transmembrane component from gap junctional inputs. We elucidate these points below:

      (1) Spatial distribution: The inputs were distributed randomly across the basal dendrites, irrespective of whether they were through gap junctions or chemical synapses. For both chemical synapses and gap junctions, the inputs were of the same nature: excitatory.

      (2) Different numbers of inputs: We have presented consistent results for both fewer and more gap junctions or chemical synapses in our analyses (see Figure 1 with 217 gap junctions or 245 chemical synapses and Supplementary Figure 2 with 99 gap junctions or 30 chemical synapses). Our fundamentally novel result that gap junctions onto active dendrites shape LFPs holds true irrespective of the relative density of gap junctions onto the neuron.

      (3) Synchronous inputs (Figs. 1–3): For chemical synapses, the waveforms are in the shape of postsynaptic potentials. For gap junctional inputs, the waveforms are in the shape of postsynaptic potentials or dendritic spikes (to respect the active nature of inputs from the other cell). Here, the electrical response of the postsynaptic cell is identical irrespective of whether inputs arrive through gap junctions or chemical synapses: an action potential. We quantitatively matched the strengths such that the model generated a single action potential in response to synchronous inputs, irrespective of whether they arrived through chemical synaptic or gap junctional inputs. We mechanistically analyzed the contributions of different cellular components and show that the direct transmembrane current in chemical synapses is the distinguishing factor that determines the dichotomy between the contributions of gap junctions vs. chemical synapses to extracellular potentials (Figs. 2–3). In the revised manuscript, we have shown the intracellular responses to demonstrate that they are electrically matched (new Supplementary Figure 3).

      (4) Random inputs (Fig. 4): For random inputs, we did not account for the number of action potentials that arrived, as the only observation we made here was with reference to the biphasic nature of the extracellular potentials with gap junctional inputs in the “No Sodium” scenario. We note that in the “No Sodium” scenario, the time-domain amplitudes were comparable for the field potentials (Fig. 4B, Fig. 4D).

      (5) Rhythmic inputs (Figs. 5–8): For rhythmic inputs, please note that the intracellular and extracellular waveforms for every frequency are provided in Supplementary Figures S5–S11. It may be noted that the intracellular responses are comparable. In simulations for assessing spike-LFP comparison, we tuned the strengths to produce a single spike per cycle, ensuring fair comparison of LFPs with gap junctions vs. chemical synapses.

      Taken together, we demonstrate through explicit sets of simulations and analyses that the differences in LFPs were not driven by the strength or patterns of the inputs but rather by the differences in direct transmembrane currents, which are subsequently reflected in the LFPs. In the revised manuscript, we have emphasized these points in the Discussion section, apart from providing intracellular traces for cases where they were not provided before (new Supplementary Figure 3):

      Discussion subsection “Dominance of active dendritic currents with LFP associated with gap junctions”

      “Our analyses quantitatively demonstrate that the fundamental difference between the gap junctional and chemical synaptic contributions to extracellular potentials is the absence of the direct transmembrane component from gap junctional inputs. A multitude of factors suggests that the observed LFP differences result not from variations in input strength or patterns but rather from differences in direct transmembrane currents, which are subsequently reflected in the LFP signals.

      First, the inputs were distributed randomly across the basal dendrites, irrespective of whether they were through gap junctions or chemical synapses. For both chemical synapses and gap junctions, the inputs were exclusively excitatory in nature.

      Second, the results remained consistent regardless of the number of gap junctions or chemical synapses (Fig. 1 with 217 gap junctions or 245 chemical synapses and Supplementary Fig. 2 with 99 gap junctions or 30 chemical synapses). Our fundamentally novel result that gap junctions onto active dendrites shape LFPs holds true irrespective of the relative density of gap junctions onto the neuron.

      Third, for synchronous chemical synaptic inputs, the waveforms resembled typical postsynaptic potentials, whereas for gap junctional inputs, the waveforms showed characteristics of postsynaptic potentials or dendritic spikes (accounting for the active nature of inputs from the potential presynaptic cells). The electrical response of the postsynaptic cell remains identical, producing an action potential regardless of whether inputs arrive via gap junctions or chemical synapses. We quantitatively matched the strengths such that the model generated a single action potential in response to synchronous inputs, irrespective of whether they arrived through chemical synaptic or gap junctional inputs. We mechanistically analyzed the contributions of different cellular components and show that the direct transmembrane current in chemical synapses is the distinguishing factor that determines the dichotomy between the contributions of gap junctions vs. chemical synapses to extracellular potentials (Figs. 2–3).

      Fourth, for random inputs, the models were not specifically tuned to generate a single action potential. Here, the inputs served as a proxy for asynchronous inputs arriving from other subregions at random times.

      Finally, the intracellular responses were comparable for chemical synaptic and gap junctional rhythmic inputs (Supplementary Figs. S5–S11). Here, the model was tuned to elicit a single spike per cycle in simulations evaluating spike-LFP interactions, ensuring a fair comparison between LFPs from gap junctional and chemical synaptic inputs.”

      We have added a new Supplementary Figure 3 to the revised manuscript and have referred to this figure in the Results subsection. We thank you for raising these points, as it allowed us to elaborate on the several novelties and implications of our methodology and conclusions. Thank you.

      Some of the explanations offered for the effects of cellular manipulations on the LFP appear to be incomplete. More specifically, the authors observed that blocking leak channels significantly changed the shape of the LFP response to synchronous synaptic inputs - but only when electric inputs were used, and when sodium channels were intact. The authors seemed to attribute this phenomenon to a direct effect of leak currents on the extracellular potential - however, this appears unlikely both because it does not explain why blocking the leak conductance had no effect in the other cases, and because the leak current is several orders of magnitude smaller than the spike-generating currents that make the largest contributions to the LFP. An indirect effect mediated by interactions of the leak current with some voltage-gated currents appears to be the most likely explanation, but identifying the exact mechanism would require further simulation experiments and/or a detailed analysis of intracellular currents and the membrane potential in time and space.

      We thank you for raising this important question. Leak channels were among the several contributors to the positive deflection observed in LFPs associated with gap junctions. This effect was present not only in gap junctional models with intact sodium conductance but also in the no-sodium model, where the amplitude of the positive deflection was reduced across other models as well (Fig. 2F, I). Furthermore, even in the absence of leak conductance, a small positive deflection was still observed (Fig. 2F), leading us to further investigate other transmembrane currents over time and across spatial locations, from the proximal to the distal dendritic ends relative to the soma (Fig. 3D). We had observed that the dominant contributor in the case of chemical synapses was the inward synaptic current (Fig. 3A), whereas for gap junctions, the primary contributors were leak conductance along with other outward currents, such as potassium and HCN currents (Fig. 3D). Together, the direct transmembrane component of chemical synapses provides a dominant contribution to extracellular potentials. This dominance translates to differences in the relative contributions of indirect currents (including leak currents) to extracellular potentials associated with chemical synaptic vs. gap junctional inputs. Our analyses of the exact ionic mechanisms (Fig. 3) demonstrate the involvement of several ion channels contributing to the indirect component in either scenario.

      In every simulation experiment in this study, inputs through electric synapses are modeled as intracellular current injections of pre-determined amplitude and time course based on the sampled dendritic voltage of potential synaptic partners. This is a major simplification that may have a significant impact on the results. First, the current through gap junctions depends on the voltage difference between the two connected cellular compartments and is thus sensitive to the membrane potential of the cell that is treated as the neuron "receiving" the input in this study (although, strictly speaking, there is no pre- or postsynaptic neuron in interactions mediated by gap junctions). This dependence on the membrane potential of the target neuron is completely missing here. A related second point is that gap junctions also change the apparent membrane resistance of the neurons they connect, effectively acting as additional shunting (or leak) conductance in the relevant compartments. This effect is completely missed by treating gap junctions as pure current sources.
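      The distinction drawn in this comment can be made concrete with a toy model. The sketch below is not the authors' implementation; it assumes a passive single-compartment target cell, a partner compartment clamped at a fixed depolarized voltage, and illustrative parameter values. It compares an ohmic gap junction, whose current follows the voltage difference between the two compartments, with a pre-determined current injection computed once at the resting potential.

```python
def simulate(mode, g_gap=2e-9, t_stop=0.1, dt=1e-5):
    """Forward-Euler voltage trace (in volts) of a passive target compartment.

    mode="ohmic":  i_gap = g_gap * (v_partner - v), sensitive to the target voltage.
    mode="source": i_gap is fixed at its value for a resting target cell.
    All parameter values are illustrative assumptions.
    """
    c_m, g_leak, e_leak = 100e-12, 10e-9, -65e-3   # farad, siemens, volt
    v_partner = -45e-3                              # partner held depolarized
    i_fixed = g_gap * (v_partner - e_leak)          # pre-determined current
    v = e_leak
    trace = []
    for _ in range(int(round(t_stop / dt))):
        if mode == "ohmic":
            i_gap = g_gap * (v_partner - v)
        else:
            i_gap = i_fixed
        v += dt * (-g_leak * (v - e_leak) + i_gap) / c_m
        trace.append(v)
    return trace

v_ohmic = simulate("ohmic")[-1]
v_source = simulate("source")[-1]
print(f"ohmic coupling : {v_ohmic * 1e3:.2f} mV")
print(f"current source : {v_source * 1e3:.2f} mV")
```

      In this sketch the ohmic junction settles at a lower depolarization than the fixed current source, because the coupling conductance adds to the leak and the driving force shrinks as the target cell depolarizes. These are the two effects the reviewer notes are absent when gap junctions are treated as pure current sources.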

      We thank you for raising this important point. We agree with the analyses presented by the reviewer on the importance of network simulations and bidirectional gap junctions that respect the voltages in both neurons. However, the complexities of LFP modeling preclude modeling of networks of morphologically realistic models with patterns of stimulations occurring across the dendritic tree. LFP modeling studies predominantly use “post-synaptic” currents to analyze the impact of different patterns of inputs arriving on to a neuron, even when chemical synaptic inputs are considered. Explicitly, individual neurons are separately simulated with different patterns of synaptic inputs, the transmembrane current at different locations recorded, and the extracellular potential is then computed using line source approximation (Buzsaki et al., 2012; Gold et al., 2006; Halnes et al., 2024; Ness et al., 2018; Reimann et al., 2013; Schomburg et al., 2012; Sinha & Narayanan, 2015, 2022). Even in scenarios where a network is analyzed, a hybrid approach involving the outputs of a point-neuron-based network being coupled to an independent morphologically realistic neuronal model is employed (Hagen et al., 2016; Martinez-Canada et al., 2021; Mazzoni et al., 2015). Given the complexities associated with the computation of electrode potentials arising as a distance-weighted summation of several transmembrane currents, these simplifications become essential.
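      The distance-weighted summation described above can be sketched as follows. For brevity, this sketch uses the simpler point-source approximation rather than the line source approximation used in the study, and it is not the authors' code. The function name, the extracellular conductivity of 0.3 S/m, the geometry, and the current values are all assumptions for illustration. The sign convention assumed here is that positive currents are outward (from the cell into the extracellular medium).

```python
import math

def point_source_potential(currents, positions, electrode, sigma=0.3):
    """Extracellular potential (V) as phi = sum_n I_n / (4 * pi * sigma * r_n).

    currents:  transmembrane currents in amperes (positive = outward).
    positions: compartment midpoints as (x, y, z) tuples in metres.
    electrode: recording site as an (x, y, z) tuple in metres.
    sigma:     assumed homogeneous extracellular conductivity in S/m.
    """
    phi = 0.0
    for i_n, pos in zip(currents, positions):
        r = math.dist(pos, electrode)          # electrode-to-source distance
        phi += i_n / (4.0 * math.pi * sigma * r)
    return phi

# Three compartments along a dendrite; electrode 50 um from the input site.
positions = [(0.0, 0.0, 0.0), (0.0, 100e-6, 0.0), (0.0, 200e-6, 0.0)]
electrode = (50e-6, 0.0, 0.0)

# Chemical synapse: inward synaptic current at the input site, outward returns.
phi_chem = point_source_potential([-1e-9, 0.5e-9, 0.5e-9], positions, electrode)
# Gap junction: no transmembrane current at the input site, outward returns only.
phi_gap = point_source_potential([0.0, 0.5e-9, 0.5e-9], positions, electrode)
print(f"chemical synapse: {phi_chem * 1e6:+.2f} uV")
print(f"gap junction    : {phi_gap * 1e6:+.2f} uV")
```

      With these assumed values, the electrode near the input site sees a negative potential when the local inward synaptic current is present and a positive potential when only the outward return currents remain.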

      Our approach models gap junctional currents in a manner similar to how other models incorporate synaptic currents in LFP modeling (Buzsaki et al., 2012; Gold et al., 2006; Halnes et al., 2024; Ness et al., 2018; Reimann et al., 2013; Schomburg et al., 2012; Sinha & Narayanan, 2015, 2022). As gap junctions are typically implemented as resistors from the other neuronal compartment, we accounted for gap-junctional variability in our model by randomizing the scaling factors and the exact waveforms that arrive through individual gap junctions at specific locations. Thus, the inputs were not pre-determined by “pre” neurons. Instead, the recorded voltages from potential synaptic partner neurons were randomized across locations and scaled using factors at the dendrites before being injected into the target neuron (Supplementary Fig. S1). While incorporating a network of interconnected neurons is indeed important, we utilized a biophysically and morphologically realistic CA1 neuron model with different sets of input patterns to model LFPs, which were derived from the total transmembrane currents across all compartments of the multi-compartmental neuron model. Given the complexity of this approach, adding further network-level interactions or pre-post connections would have been computationally demanding.

      In the revised manuscript, we have elaborated on the general methodology used in LFP modeling studies to introduce synaptic currents. We have emphasized that our study extends this approach to modeling gap junctional inputs, while also highlighting randomization of locations and the scaling process in assigning gap junctional synaptic strengths. The following paragraphs were specifically added to the revised version of the manuscript:

      Methods subsection “Chemical synaptic and gap junctional inputs: Characteristics and temporal dynamics”:

      “The complexities of LFP modeling preclude modeling of networks of morphologically realistic models with patterns of stimulations occurring across the dendritic tree. LFP modeling studies predominantly use post-synaptic currents to analyze the impact of different patterns of inputs arriving on to a neuron, even when chemical synaptic inputs are considered. Explicitly, individual neurons are separately simulated with different patterns of synaptic inputs, the transmembrane current at different locations recorded, and the extracellular potential is then computed using line source approximation (Buzsaki et al., 2012; Gold et al., 2006; Halnes et al., 2024; Ness et al., 2018; Reimann et al., 2013; Schomburg et al., 2012; Sinha & Narayanan, 2015, 2022). Even in scenarios where a network is analyzed, a hybrid approach involving the outputs of a point-neuron-based network being coupled to an independent morphologically realistic neuronal model is employed (Hagen et al., 2016; Martinez-Canada et al., 2021; Mazzoni et al., 2015). Given the complexities associated with the computation of electrode potentials arising as a distance-weighted summation of several transmembrane currents, these simplifications become essential.”

      “Our approach models gap junctional currents in a manner similar to how other models incorporate synaptic currents in LFP modeling (Buzsaki et al., 2012; Gold et al., 2006; Halnes et al., 2024; Ness et al., 2018; Reimann et al., 2013; Schomburg et al., 2012; Sinha & Narayanan, 2015, 2022). As gap junctions are typically implemented as resistors from the other neuronal compartment, we accounted for gap-junctional variability in our model by randomizing the scaling factors and the exact waveforms that arrive through individual gap junctions at specific locations from potential presynaptic sources.”

      We thank you for highlighting these points, as it allowed us to place our methodology in the specific context of the literature. Thank you.

      One prominent claim of the article that is emphasized even in the abstract is that HCN channels mediate an outward current in certain cases. Although this statement is technically correct, there are two reasons why I do not consider this a major finding of the paper. First, as the authors acknowledge, this is a trivial consequence of the relatively slow kinetics of HCN channels: when at least some of the channels are open, any input that is sufficiently fast and strong to take the membrane potential across the reversal potential of the channel will lead to the reversal of the polarity of the current. This effect is quite generic and well-known and is by no means specific to gap junctional inputs or even HCN channels. Second, and perhaps more importantly, the functional consequence of this reversed current through HCN channels is likely to be negligible. As clearly shown in Supplementary Figure S3, the HCN current becomes outward only for an extremely short time period during the action potential, which is also a period when several other currents are also active and likely dominant due to their much higher conductances. I also note that several of these relevant facts remain hidden in Figure 3, both because of its focus on peak values, and because of the radically different units on the vertical axes of the current plots.

      We thank you for raising this point and agree with you on every point. Please note that we do not assert that the outward HCN currents are exclusively associated with gap junctional inputs. Rather, our results show that synchronous inputs generate outward HCN currents in both chemical synapses (Fig. 3B; positive/outward HCN currents, except in the no sodium or leak model) and gap junctions (Fig. 3D; positive/outward HCN currents). We emphasized this in the case of gap junctions because, in the absence of inward synaptic currents, HCN (acting as outward currents with synchronous inputs) contributed to the positive deflection observed in the LFPs. While HCN would also contribute in the case of chemical synapses, its effect was negligible due to the presence of large inward synaptic currents. Since LFPs reflect the collective total transmembrane currents, the dominant contributors differ between these two scenarios, which we aimed to highlight. Since HCN exhibited outward currents in our synchronous input simulations, we have elaborated on this mechanism in the supplementary figure (Fig. S3). Our intention was not to emphasize this effect for only one synaptic mode but rather to highlight HCN's contribution to the positive deflection as one of the contributing factors.

      We agree that HCN currents are relatively small in magnitude; therefore, our conclusions were based on HCN being one of the several contributing factors. Leak conductance and other outward conductances, including HCN currents (Fig. 3D), collectively contribute to the positive deflections observed in the case of gap junctional synchronous inputs.

      In the revised manuscript, we have provided the following clarifications in the Results subsection on “Synchronous inputs: Outward transmembrane currents from active dendrites contribute to positive deflection in extracellular potentials associated with gap junctional inputs”:

      “It is important to note that despite their relatively small magnitude, the outward HCN currents (Fig. 3D) substantially contribute to positive extracellular potential deflections associated with gap junctional inputs (Fig. 2), together with leak and other outward conductances.”

      “While outward HCN currents (Fig. 3B) are also expected to influence LFPs under chemical synaptic input, their impact was minimal due to the predominance of large inward synaptic currents (Fig. 3A). As LFPs reflect the summation of all transmembrane currents, the dominant contributors vary across different modes of synaptic transmission.”

      We thank you for emphasizing this point. This allowed us to expand on the specific roles of HCN channels and potential contributions of the outward nature of the HCN current. We have also expanded the Discussion subsection on “Outward HCN currents regulate extracellular potentials” to elaborate on this aspect as well. Thank you.
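
      The polarity reversal discussed in this exchange can be sketched with a simple ohmic current. This is a hedged illustration only: the reversal potential of -30 mV, the fixed open conductance (standing in for the slow HCN kinetics), and the function name are assumptions, not parameters from the manuscript.

```python
def hcn_current_pA(v_mV, g_open_nS, e_rev_mV=-30.0):
    """Ohmic current through open HCN channels, in pA (nS * mV = pA).

    Negative values are inward currents, positive values are outward.
    The open conductance is held fixed, which mimics channels that gate
    too slowly to close during a fast voltage excursion (an assumption).
    """
    return g_open_nS * (v_mV - e_rev_mV)


g_open = 1.0  # nS, illustrative value

i_rest = hcn_current_pA(-65.0, g_open)   # below reversal: inward
i_spike = hcn_current_pA(20.0, g_open)   # above reversal: outward
```

      The sign flips only while the membrane potential sits above the assumed reversal potential, consistent with the reviewer's point that the outward phase is brief and tied to fast, strong depolarization.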

      Finally, I missed an appropriate validation of the neuronal model used, and also the characterization of the effects of the in silico manipulations used on the basic behavior of the model. As far as I understand, the model in its current form has not been used in other studies. If this is the case, it would be important to demonstrate convincingly through (preferably quantitative) comparisons with experimental data using different protocols that the model captures the physiological behavior of at least the relevant compartments (in this case, the dendrites and the soma) of hippocampal pyramidal neurons sufficiently well that the results of the modeling study are relevant to the real biological system. In addition, the correct interpretation of various manipulations of the model would be strongly facilitated by investigating and discussing how the physiological properties of the model neuron are affected by these alterations.

      We thank you for raising this important point. The CA1 pyramidal neuronal model used in this study is built with ion-channel models derived from biophysical and electrophysiological recordings from these cells. As mentioned in the Methods section “Dynamics and distribution of active channels” and Supplementary Table S1, models for individual channels, their gating kinetics, and channel distributions across the somatodendritic arbor (wherever known) are all derived from their physiological equivalents. Importantly, these values were derived from previously validated models from the laboratory, which contain these very ion channel models and the exact same morphology (Roy & Narayanan, 2021). Please compare Supplementary Table S1 with Table 1 from (Roy & Narayanan, 2021). Please note that this model was validated against several physiological measurements along the somatodendritic axis (Fig. 1 of (Roy & Narayanan, 2021)).

      In the revised manuscript, we have explicitly mentioned this while also mentioning the different physiological properties that were used for the validation process from (Roy & Narayanan, 2021):

      Methods subsection “Pyramidal neuron model”

      “All parameters and their corresponding values for the neuronal model were derived from previously validated models (Roy & Narayanan, 2021). These CA1 models were validated against several physiological measurements along the somato-dendritic axis (Roy & Narayanan, 2021).”

      “These channel distributions and the associated parametric values (Supplementary Table S1) were demonstrated to satisfy 22 different somato-dendritic measurements (Roy & Narayanan, 2021). Specifically, six physiological measurements (input resistance, maximal impedance amplitude, resonance frequency, resonance strength, total inductive phase, and back-propagating action potential) were validated against their respective electrophysiological ranges at each of three somato-dendritic locations (soma, ~150 µm dendrite, and ~300 µm dendrite), giving 6×3=18 measurements. In addition, action potential firing frequency at each of 100 pA, 150 pA, 200 pA, and 250 pA (4 measurements) was also matched in the model to fall within the respective ranges of corresponding electrophysiological measurements. The electrophysiological ranges of intrinsic measurements were derived from respective somato-dendritic recordings (Malik et al., 2016; Narayanan et al., 2010; Narayanan & Johnston, 2007, 2008; Spruston et al., 1995). Together, the CA1 pyramidal model neuron used in this study was tuned to match several electrophysiological characteristics and ion-channel distributions (Roy & Narayanan, 2021).”
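
      The tally of 22 measurements quoted above can be checked with a short enumeration. The labels below are taken from the quoted text; the variable names are illustrative.

```python
# Six intrinsic measurements, each evaluated at three locations.
measurements = [
    "input resistance",
    "maximal impedance amplitude",
    "resonance frequency",
    "resonance strength",
    "total inductive phase",
    "back-propagating action potential",
]
locations = ["soma", "~150 um dendrite", "~300 um dendrite"]

# Firing frequency is matched at four current amplitudes.
firing_rate_currents_pA = [100, 150, 200, 250]

location_dependent = [(m, loc) for m in measurements for loc in locations]
total_measurements = len(location_dependent) + len(firing_rate_currents_pA)
```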

      We thank you for pointing us to this slip in elaborating on how the model was validated. We have now rectified this. Thank you.

      References

      Andersen, P., Morris, R., Amaral, D., Bliss, T., & O'Keefe, J. (2006). The hippocampus book. Oxford University Press.

      Basak, R., & Narayanan, R. (2018). Spatially dispersed synapses yield sharply-tuned place cell responses through dendritic spike initiation. Journal of Physiology, 596(17), 4173-4205. https://doi.org/10.1113/JP275310

      Bedner, P., Steinhauser, C., & Theis, M. (2012). Functional redundancy and compensation among members of gap junction protein families? Biochim Biophys Acta, 1818(8), 1971-1984. https://doi.org/10.1016/j.bbamem.2011.10.016

      Behrens, C. J., Ul Haq, R., Liotta, A., Anderson, M. L., & Heinemann, U. (2011). Nonspecific effects of the gap junction blocker mefloquine on fast hippocampal network oscillations in the adult rat in vitro. Neuroscience, 192, 11-19. https://doi.org/10.1016/j.neuroscience.2011.07.015

      Bocian, R., Posluszny, A., Kowalczyk, T., Golebiewski, H., & Konopacki, J. (2009). The effect of carbenoxolone on hippocampal formation theta rhythm in rats: in vitro and in vivo approaches. Brain Res Bull, 78(6), 290-298. https://doi.org/10.1016/j.brainresbull.2008.10.005

      Buhl, D. L., Harris, K. D., Hormuzdi, S. G., Monyer, H., & Buzsaki, G. (2003). Selective impairment of hippocampal gamma oscillations in connexin-36 knock-out mouse in vivo. J Neurosci, 23(3), 1013-1018. http://www.ncbi.nlm.nih.gov/pubmed/12574431

      Buzsaki, G., Anastassiou, C. A., & Koch, C. (2012). The origin of extracellular fields and currents--EEG, ECoG, LFP and spikes. Nat Rev Neurosci, 13(6), 407-420. https://doi.org/10.1038/nrn3241

      Buzsaki, G., & Wang, X. J. (2012). Mechanisms of gamma oscillations. Annual Review of Neuroscience, 35, 203-225. https://doi.org/10.1146/annurev-neuro-062111-150444

      Colgin, L. L. (2016). Rhythms of the hippocampal network. Nat Rev Neurosci, 17(4), 239-249. https://doi.org/10.1038/nrn.2016.21

      Colgin, L. L., & Moser, E. I. (2010). Gamma oscillations in the hippocampus. Physiology (Bethesda), 25(5), 319-329. https://doi.org/10.1152/physiol.00021.2010

      Coulon, P., & Landisman, C. E. (2017). The Potential Role of Gap Junctional Plasticity in the Regulation of State. Neuron, 93(6), 1275-1295. https://doi.org/10.1016/j.neuron.2017.02.041

      Das, A., Rathour, R. K., & Narayanan, R. (2017). Strings on a Violin: Location Dependence of Frequency Tuning in Active Dendrites. Front Cell Neurosci, 11, 72. https://doi.org/10.3389/fncel.2017.00072

      Draguhn, A., Traub, R. D., Schmitz, D., & Jefferys, J. G. (1998). Electrical coupling underlies high-frequency oscillations in the hippocampus in vitro. Nature, 394(6689), 189-192. https://doi.org/10.1038/28184

      Einevoll, G. T., Destexhe, A., Diesmann, M., Grun, S., Jirsa, V., de Kamps, M., Migliore, M., Ness, T. V., Plesser, H. E., & Schurmann, F. (2019). The Scientific Case for Brain Simulations. Neuron, 102(4), 735-744. https://doi.org/10.1016/j.neuron.2019.03.027

      Einevoll, G. T., Kayser, C., Logothetis, N. K., & Panzeri, S. (2013). Modelling and analysis of local field potentials for studying the function of cortical circuits. Nat Rev Neurosci, 14(11), 770-785. https://doi.org/10.1038/nrn3599

      Gold, C., Henze, D. A., Koch, C., & Buzsaki, G. (2006). On the origin of the extracellular action potential waveform: A modeling study. J Neurophysiol, 95(5), 3113-3128. https://doi.org/10.1152/jn.00979.2005

      Hagen, E., Dahmen, D., Stavrinou, M. L., Linden, H., Tetzlaff, T., van Albada, S. J., Grun, S., Diesmann, M., & Einevoll, G. T. (2016). Hybrid Scheme for Modeling Local Field Potentials from Point-Neuron Networks. Cereb Cortex, 26(12), 4461-4496. https://doi.org/10.1093/cercor/bhw237

      Halnes, G., Ness, T. V., Næss, S., Hagen, E., Pettersen, K. H., & Einevoll, G. T. (2024). Electric Brain Signals: Foundations and Applications of Biophysical Modeling. Cambridge University Press. https://doi.org/10.1017/9781009039826

      Hormuzdi, S. G., Pais, I., LeBeau, F. E., Towers, S. K., Rozov, A., Buhl, E. H., Whittington, M. A., & Monyer, H. (2001). Impaired electrical signaling disrupts gamma frequency oscillations in connexin 36-deficient mice. Neuron, 31(3), 487-495. https://doi.org/10.1016/s0896-6273(01)00387-7

      Hussaini, S. A., Kempadoo, K. A., Thuault, S. J., Siegelbaum, S. A., & Kandel, E. R. (2011). Increased size and stability of CA1 and CA3 place fields in HCN1 knockout mice. Neuron, 72(4), 643-653. https://doi.org/10.1016/j.neuron.2011.09.007

      Johnston, D., & Narayanan, R. (2008). Active dendrites: colorful wings of the mysterious butterflies. Trends Neurosci, 31(6), 309-316. https://doi.org/10.1016/j.tins.2008.03.004

      Kessi, M., Peng, J., Duan, H., He, H., Chen, B., Xiong, J., Wang, Y., Yang, L., Wang, G., Kiprotich, K., Bamgbade, O. A., He, F., & Yin, F. (2022). The Contribution of HCN Channelopathies in Different Epileptic Syndromes, Mechanisms, Modulators, and Potential Treatment Targets: A Systematic Review. Front Mol Neurosci, 15, 807202. https://doi.org/10.3389/fnmol.2022.807202

      Kole, M. H., Hallermann, S., & Stuart, G. J. (2006). Single Ih channels in pyramidal neuron dendrites: properties, distribution, and impact on action potential output. J Neurosci, 26(6), 1677-1687. https://doi.org/10.1523/JNEUROSCI.3664-05.2006

      Konopacki, J., Kowalczyk, T., & Golebiewski, H. (2004). Electrical coupling underlies theta oscillations recorded in hippocampal formation slices. Brain Res, 1019(1-2), 270-274. https://doi.org/10.1016/j.brainres.2004.05.083

      Larkum, M. E., Wu, J., Duverdin, S. A., & Gidon, A. (2022). The Guide to Dendritic Spikes of the Mammalian Cortex In Vitro and In Vivo. Neuroscience, 489, 15-33. https://doi.org/10.1016/j.neuroscience.2022.02.009

      LeBeau, F. E., Traub, R. D., Monyer, H., Whittington, M. A., & Buhl, E. H. (2003). The role of electrical signaling via gap junctions in the generation of fast network oscillations. Brain Res Bull, 62(1), 3-13. https://doi.org/10.1016/j.brainresbull.2003.07.004

      Lo, C. W. (1999). Genes, gene knockouts, and mutations in the analysis of gap junctions. Dev Genet, 24(1-2), 1-4. https://doi.org/10.1002/(SICI)1520-6408(1999)24:1/2%3C1::AID-DVG1%3E3.0.CO;2-U

      Lorincz, A., Notomi, T., Tamas, G., Shigemoto, R., & Nusser, Z. (2002). Polarized and compartment-dependent distribution of HCN1 in pyramidal cell dendrites. Nat Neurosci, 5(11), 1185-1193. http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=12389030

      Magee, J. C. (1998). Dendritic hyperpolarization-activated currents modify the integrative properties of hippocampal CA1 pyramidal neurons. J Neurosci, 18(19), 7613-7624. http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=9742133

      Magee, J. C., & Grienberger, C. (2020). Synaptic Plasticity Forms and Functions. Annual Review of Neuroscience, 43, 95-117. https://doi.org/10.1146/annurev-neuro-090919-022842

      Major, G., Larkum, M. E., & Schiller, J. (2013). Active properties of neocortical pyramidal neuron dendrites. Annual Review of Neuroscience, 36, 1-24. https://doi.org/10.1146/annurev-neuro-062111-150343

      Malik, R., Dougherty, K. A., Parikh, K., Byrne, C., & Johnston, D. (2016). Mapping the electrophysiological and morphological properties of CA1 pyramidal neurons along the longitudinal hippocampal axis. Hippocampus, 26(3), 341-361. https://doi.org/10.1002/hipo.22526

      Martinez-Canada, P., Ness, T. V., Einevoll, G. T., Fellin, T., & Panzeri, S. (2021). Computation of the electroencephalogram (EEG) from network models of point neurons. PLoS Comput Biol, 17(4), e1008893. https://doi.org/10.1371/journal.pcbi.1008893

      Mazzoni, A., Linden, H., Cuntz, H., Lansner, A., Panzeri, S., & Einevoll, G. T. (2015). Computing the Local Field Potential (LFP) from Integrate-and-Fire Network Models. PLoS Comput Biol, 11(12), e1004584. https://doi.org/10.1371/journal.pcbi.1004584

      Mishra, P., & Narayanan, R. (2021). Stable continual learning through structured multiscale plasticity manifolds. Curr Opin Neurobiol, 70, 51-63. https://doi.org/10.1016/j.conb.2021.07.009

      Mishra, P., & Narayanan, R. (2025). The enigmatic HCN channels: A cellular neurophysiology perspective. Proteins, 93(1), 72-92. https://doi.org/10.1002/prot.26643

      Moore, J. J., Ravassard, P. M., Ho, D., Acharya, L., Kees, A. L., Vuong, C., & Mehta, M. R. (2017). Dynamics of cortical dendritic membrane potential and spikes in freely behaving rats. Science, 355(6331). https://doi.org/10.1126/science.aaj1497

      Narayanan, R., Dougherty, K. J., & Johnston, D. (2010). Calcium Store Depletion Induces Persistent Perisomatic Increases in the Functional Density of h Channels in Hippocampal Pyramidal Neurons. Neuron, 68(5), 921-935. https://doi.org/10.1016/j.neuron.2010.11.033

      Narayanan, R., & Johnston, D. (2007). Long-term potentiation in rat hippocampal neurons is accompanied by spatially widespread changes in intrinsic oscillatory dynamics and excitability. Neuron, 56(6), 1061-1075. https://doi.org/10.1016/j.neuron.2007.10.033

      Narayanan, R., & Johnston, D. (2008). The h channel mediates location dependence and plasticity of intrinsic phase response in rat hippocampal neurons. J Neurosci, 28(22), 5846-5860. https://doi.org/10.1523/JNEUROSCI.0835-08.2008

      Ness, T. V., Remme, M. W. H., & Einevoll, G. T. (2016). Active subthreshold dendritic conductances shape the local field potential. Journal of Physiology, 594(13), 3809-3825. https://doi.org/10.1113/JP272022

      Ness, T. V., Remme, M. W. H., & Einevoll, G. T. (2018). h-Type Membrane Current Shapes the Local Field Potential from Populations of Pyramidal Neurons. J Neurosci, 38(26), 6011-6024. https://doi.org/10.1523/jneurosci.3278-17.2018

      Neves, G., Cooke, S. F., & Bliss, T. V. (2008). Synaptic plasticity, memory and the hippocampus: a neural network approach to causality. Nat Rev Neurosci, 9(1), 65-75. http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=18094707

      Nolan, M. F., Malleret, G., Dudman, J. T., Buhl, D. L., Santoro, B., Gibbs, E., Vronskaya, S., Buzsaki, G., Siegelbaum, S. A., Kandel, E. R., & Morozov, A. (2004). A behavioral role for dendritic integration: HCN1 channels constrain spatial memory and plasticity at inputs to distal dendrites of CA1 pyramidal neurons. Cell, 119(5), 719-732. http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15550252

      O'Brien, J. (2014). The ever-changing electrical synapse. Curr Opin Neurobiol, 29, 64-72. https://doi.org/10.1016/j.conb.2014.05.011

      O'Keefe, J., & Recce, M. L. (1993). Phase relationship between hippocampal place units and the EEG theta rhythm. Hippocampus, 3(3), 317-330. https://doi.org/10.1002/hipo.450030307

      Pereda, A. E. (2014). Electrical synapses and their functional interactions with chemical synapses. Nat Rev Neurosci, 15(4), 250-263. https://doi.org/10.1038/nrn3708

      Posluszny, A. (2014). The contribution of electrical synapses to field potential oscillations in the hippocampal formation. Front Neural Circuits, 8, 32. https://doi.org/10.3389/fncir.2014.00032

      Reimann, M. W., Anastassiou, C. A., Perin, R., Hill, S. L., Markram, H., & Koch, C. (2013). A biophysically detailed model of neocortical local field potentials predicts the critical role of active membrane currents. Neuron, 79(2), 375-390. https://doi.org/10.1016/j.neuron.2013.05.023

      Rouach, N., Segal, M., Koulakoff, A., Giaume, C., & Avignone, E. (2003). Carbenoxolone blockade of neuronal network activity in culture is not mediated by an action on gap junctions. Journal of Physiology, 553(Pt 3), 729-745. https://doi.org/10.1113/jphysiol.2003.053439

      Roy, A., & Narayanan, R. (2021). Spatial information transfer in hippocampal place cells depends on trial-to-trial variability, symmetry of place-field firing, and biophysical heterogeneities. Neural Netw, 142, 636-660. https://doi.org/10.1016/j.neunet.2021.07.026

      Schomburg, E. W., Anastassiou, C. A., Buzsaki, G., & Koch, C. (2012). The spiking component of oscillatory extracellular potentials in the rat hippocampus. J Neurosci, 32(34), 11798-11811. https://doi.org/10.1523/JNEUROSCI.0656-12.2012

      Seenivasan, P., & Narayanan, R. (2020). Efficient phase coding in hippocampal place cells. Physical Review Research, 2(3), 033393. https://doi.org/10.1103/PhysRevResearch.2.033393

      Sinha, M., & Narayanan, R. (2015). HCN channels enhance spike phase coherence and regulate the phase of spikes and LFPs in the theta-frequency range. Proc Natl Acad Sci U S A, 112(17), E2207-2216. https://doi.org/10.1073/pnas.1419017112

      Sinha, M., & Narayanan, R. (2022). Active Dendrites and Local Field Potentials: Biophysical Mechanisms and Computational Explorations. Neuroscience, 489, 111-142. https://doi.org/10.1016/j.neuroscience.2021.08.035

      Sirmaur, R., & Narayanan, R. (2024). Distinct extracellular signatures of chemical and electrical synapses impinging on active dendrites differentially contribute to ripple-frequency oscillations. Society for Neuroscience annual meeting (https://www.abstractsonline.com/pp8/?_gl=1*1bxo7m*_gcl_au*MTc5MTQ0NjE0NC4xNzI3MDcwOTMw*_ga*MTMxMTE5OTcyMy4xNzI3MDcwOTMx*_ga_T09K3Q2WDN*MTcyNzA3MDkzMS4xLjEuMTcyNzA3MDkzNy41NC4wLjA.#!/20433/presentation/13949), Chicago, USA.

      Spruston, N., Schiller, Y., Stuart, G., & Sakmann, B. (1995). Activity-dependent action potential invasion and calcium influx into hippocampal CA1 dendrites. Science, 268(5208), 297-300. http://www.ncbi.nlm.nih.gov/pubmed/7716524

      Stuart, G. J., & Spruston, N. (2015). Dendritic integration: 60 years of progress. Nat Neurosci, 18(12), 1713-1721. https://doi.org/10.1038/nn.4157

      Szarka, G., Balogh, M., Tengolics, A. J., Ganczer, A., Volgyi, B., & Kovacs-Oller, T. (2021). The role of gap junctions in cell death and neuromodulation in the retina. Neural Regen Res, 16(10), 1911-1920. https://doi.org/10.4103/1673-5374.308069

      Traub, R. D., Cunningham, M. O., Gloveli, T., LeBeau, F. E., Bibbig, A., Buhl, E. H., & Whittington, M. A. (2003). GABA-enhanced collective behavior in neuronal axons underlies persistent gamma-frequency oscillations. Proc Natl Acad Sci U S A, 100(19), 11047-11052. https://doi.org/10.1073/pnas.1934854100

      Vaughn, M. J., & Haas, J. S. (2022). On the Diverse Functions of Electrical Synapses. Front Cell Neurosci, 16, 910015. https://doi.org/10.3389/fncel.2022.910015

      Wang, X. J. (2010). Neurophysiological and computational principles of cortical rhythms in cognition. Physiol Rev, 90(3), 1195-1268. https://doi.org/10.1152/physrev.00035.2008

      Wang, X. J., & Buzsaki, G. (1996). Gamma oscillation by synaptic inhibition in a hippocampal interneuronal network model. J Neurosci, 16(20), 6402-6413. http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=8815919

      Whittington, M. A., Traub, R. D., & Jefferys, J. G. (1995). Synchronized oscillations in interneuron networks driven by metabotropic glutamate receptor activation. Nature, 373(6515), 612-615. https://doi.org/10.1038/373612a0

      Williams, S. R., & Stuart, G. J. (2000). Site independence of EPSP time course is mediated by dendritic I(h) in neocortical pyramidal neurons. J Neurophysiol, 83(5), 3177-3182. http://www.ncbi.nlm.nih.gov/pubmed/10805715

    1. eLife Assessment

      This timely and fundamental study presents an innovative iPSC-based co-culture system to model Kupffer cell-hepatocyte interactions and hepatotoxicity, demonstrating reciprocal acquisition of tissue identity and enhanced hepatocyte maturation. The work is convincing, supported by well-executed methodology and functional validation, including physiologically relevant, concentration-dependent hepatotoxic responses. The research approach is promising and of broad interest; further clarification of experimental design and interpretation may strengthen its impact.

    2. Reviewer #1 (Public review):

      The manuscript presents a compelling new in vitro system based on isogenic co-cultures of human iPSC-derived hepatocytes and macrophages, enabling the modelling of hepatic immune responses with unprecedented physiological relevance. The authors show that co-culture leads to enhanced maturation of hepatocytes and tissue-resident macrophage identity, which cannot be achieved through conditioned media alone. Using this system, they functionally validate immune-driven hepatotoxic responses to a panel of drugs and compare the system's predictive power to that of monocyte-derived macrophages. The results underscore the necessity of macrophage-hepatocyte crosstalk for accurate modelling of liver inflammation and drug toxicity in vitro. The manuscript is clearly written and addresses a key limitation in liver organoid systems: the lack of immune complexity and tissue-specific macrophage imprinting.

      Strengths:

      • Novelty and Relevance: The study presents a highly innovative co-culture system based on isogenic human iPSCs, addressing an unmet need in modelling immune-mediated hepatotoxicity.

      • Mechanistic Insight: The reciprocal reprogramming between iHeps and iMacs, including induction of KC-specific pathways and hepatocyte maturation markers, is convincingly demonstrated.

      • Functional Readouts: The application of the model to detect IL-6 responses to hepatotoxic compounds enhances its translational relevance.

      Weaknesses:

      The co-culture model with monocyte-derived macrophages is not fully characterised, making comparisons less informative.

    3. Reviewer #3 (Public review):

      Summary:

      In this study, the authors establish a human in vitro liver model by co-culturing induced hepatocyte-like cells (iHEPs) with induced macrophages (iMACs). Through flow cytometry-based sorting of cell populations at days 3 and 7 of co-culture, followed by bulk RNA sequencing, they demonstrate that bidirectional interactions between these two cell types drive functional maturation. Specifically, the presence of iMACs accelerates the hepatic maturation program of iHEPs, while contact-dependent cues from iHEPs enhance the acquisition of Kupffer cell identity in iMACs, indicating that direct cell-cell interactions are critical for establishing tissue-resident macrophage characteristics.

      Functionally, the authors show that iMAC-derived Kupffer-like cells respond to pathological stimuli by producing interleukin-6 (IL-6), a hallmark cytokine of hepatic immune activation. When exposed to a panel of clinically relevant hepatotoxic drugs, the co-culture system exhibited concentration-dependent modulation of IL-6 secretion consistent with reported drug-induced liver injury (DILI) phenotypes. Notably, this response was absent when hepatocytes were co-cultured with monocyte-derived macrophages from peripheral blood, underscoring the liver-specific phenotype and functional relevance of the iMAC-derived Kupffer-like cells. Collectively, the study proposes this co-culture platform as a more physiologically relevant model for interrogating macrophage-hepatocyte crosstalk and assessing immune-mediated hepatotoxicity in vitro.

      Strengths:

      A major strength of this study lies in its systematic dissection of cell-cell interactions within the co-culture system. By isolating each cell type following co-culture and performing comprehensive transcriptomic analyses, the authors provide direct evidence of bidirectional crosstalk between iMACs and iHEPs. The comparison with single-culture controls is particularly valuable, as it clearly demonstrates how co-culture enhances functional maturation and lineage-specific gene expression in both cell types. This approach allows for a more mechanistic understanding of how hepatocyte-macrophage interactions contribute to the acquisition of tissue-specific phenotypes.

      Weaknesses:

      (1) Overreliance on bulk RNA-seq data:

      The primary evidence supporting cell maturation is derived from bulk RNA sequencing, which has inherent limitations in resolving heterogeneous cellular states and functional maturation. The conclusions regarding hepatocyte maturation are based largely on increased expression of a subset of CYP genes and decreased AFP levels - markers that, while suggestive, are insufficient on their own to substantiate functional maturation. Additional phenotypic or functional assays (e.g., metabolic activity, protein-level validation) would significantly strengthen these claims.

      (2) Insufficient characterization of input cell populations:

      The manuscript lacks adequate validation of the cellular identities prior to co-culture. Although the authors reference previously published protocols for generating iHEPs and iMACs, it remains unclear whether the cells used in this study faithfully retain expected lineage characteristics. For example, hepatocyte preparations should be characterized by flow cytometry for ALB and AFP expression, while iMACs should be assessed for canonical macrophage markers such as CD45, CD11b, and CD14 before co-culture. Without these baseline data, it is difficult to interpret the magnitude or significance of any co-culture-induced changes.

      (3) Quantitative assessment of IL-6 production is insufficient:

      The analysis of drug-induced IL-6 responses is based primarily on relative changes compared to control conditions. However, percentage changes alone are inadequate to capture the biological relevance of these responses. Absolute cytokine production levels - particularly in response to LPS stimulation - should be reported and directly compared to PBMC-derived macrophages to determine whether iMAC-derived Kupffer-like cells exhibit enhanced cytokine output. Moreover, the Methods section should clearly describe how ELISA results were normalized or corrected to account for potential differences in cell number, viability, or culture conditions.

      (4) Unclear mechanistic interpretation of IL-6 modulation:

      The observed changes in IL-6 production upon drug treatment cannot be interpreted solely as evidence of Kupffer cell-specific functionality. For instance, IL-6 suppression by NSAIDs such as diclofenac is well known to result from altered prostaglandin synthesis due to COX inhibition, while leflunomide's effects are linked to metabolite-induced modulation of immune cell proliferation and broader cytokine networks. These mechanisms are distinct from Kupffer cell identity and may not directly reflect liver-specific macrophage function. Consequently, changes in IL-6 secretion alone - particularly without additional mechanistic evidence or analysis of other cytokines - are insufficient to conclude that co-culture with hepatocytes drives the acquisition of bona fide Kupffer cell maturity.

      Reviewer's comments on the revised manuscript:

      The authors successfully established an isogenic, iPSC-derived human liver co-culture model to investigate the role of hepatocyte-macrophage interactions in driving Kupffer cell (KC) identity and hepatocyte maturation. By utilizing a single genetic background, the authors effectively minimized the experimental variability often encountered in non-isogenic systems. A significant highlight of this work is the demonstration that direct co-culture, as opposed to conditioned media alone, is a primary driver for critical KC identity markers such as ID1 and ID3. Furthermore, the model's ability to recapitulate complex clinical IL-6 responses to known hepatotoxicants, where standard models have failed, underscores its potential utility for early-stage DILI screening. However, there are significant methodological concerns regarding the data analysis. While the study compares four or five distinct experimental groups (e.g., Day 0, Day 7, Day 3 co-culture, and Day 7 co-culture), the authors utilized Student's t-tests for these comparisons. This approach does not account for the multiple comparisons problem and increases the risk of Type I errors. Additionally, while IL-6 secretion is used as a primary functional readout, the individual mechanisms behind these drug responses were not explored experimentally. Finally, Pearson correlation analysis indicates that the iMacs remain poorly correlated with actual in vivo human embryonic liver macrophages, suggesting that the "imprinting" of true KC identity remains incomplete.
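
      The multiple comparisons concern raised above can be made concrete with a rough sketch. It assumes that all pairwise comparisons among five groups are tested, that the tests are independent, and that a nominal alpha of 0.05 is used; none of these specifics are stated in the review, and the function names are illustrative. Bonferroni is shown as one simple correction, not as the method the reviewer prescribes.

```python
import math


def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(alpha, n_tests):
    """Per-test threshold that keeps the familywise rate at or below alpha."""
    return alpha / n_tests


n_groups = 5                               # assumed upper bound from the review
n_pairwise = math.comb(n_groups, 2)        # number of uncorrected t-tests
fwer = familywise_error_rate(0.05, n_pairwise)
per_test_alpha = bonferroni_alpha(0.05, n_pairwise)
```

      Under these assumptions the uncorrected familywise error rate is roughly 40%, which is why an omnibus test such as ANOVA followed by corrected post hoc comparisons is commonly preferred for designs with several groups.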

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      The manuscript presents a compelling new in vitro system based on isogenic co-cultures of human iPSC-derived hepatocytes and macrophages, enabling the modelling of hepatic immune responses with unprecedented physiological relevance. The authors show that co-culture leads to enhanced maturation of hepatocytes and tissue-resident macrophage identity, which cannot be achieved through conditioned media alone. Using this system, they functionally validate immune-driven hepatotoxic responses to a panel of drugs and compare the system's predictive power to that of monocyte-derived macrophages. The results underscore the necessity of macrophage-hepatocyte crosstalk for accurate modelling of liver inflammation and drug toxicity in vitro.

      The manuscript is clearly written and addresses a key limitation in liver organoid systems: the lack of immune complexity and tissue-specific macrophage imprinting. Nevertheless, several conclusions would benefit from a more careful interpretation of the data, and some important controls or explanations are missing, particularly in the flow cytometry gating strategies, stress marker validation, and cluster interpretations.

      Strengths:

      (1) Novelty and Relevance: The study presents a highly innovative co-culture system based on isogenic human iPSCs, addressing an unmet need in modelling immune-mediated hepatotoxicity.

      (2) Mechanistic Insight: The reciprocal reprogramming between iHeps and iMacs, including induction of KC-specific pathways and hepatocyte maturation markers, is convincingly demonstrated.

      (3) Functional Readouts: The application of the model to detect IL-6 responses to hepatotoxic compounds enhances its translational relevance.

      Weaknesses:

      (1) Several key claims, particularly those derived from PCA plots and DEG analyses, are overinterpreted and require more conservative language or further validation.

      We agree that PCA does not allow maturation trajectories to be followed. We had stated that it was only a hypothesis that co-culture promotes maturation, which we later validated by examining the expression of key hepatocyte markers and by Pearson correlation comparison with fetal hepatocytes.

      (2) The purity of sorted hepatocytes and macrophages is not convincingly demonstrated; contamination across gates may confound transcriptomic readouts.

      We agree and have highlighted and addressed this limitation in our discussion. Unfortunately, it is a limitation of bulk sequencing that a small amount of contamination might be present. However, the TPM values of ALB in the iMacs, for example, are extremely low, especially when compared to the hepatocytes, indicating that the level of contamination is likely to be very low. Likewise, the expression of CSF1R in the co-cultured iHeps is also extremely low. This has been included in Supp Fig 1F and G.

      (3) Stress response genes and ER stress/apoptosis signatures are not properly assessed, despite being potentially activated in the system.

      This has been included in Supp Fig 2C, where we’ve included the expression of ATF4, CASP3 and CASP9. Although there’s a significant difference in ATF4 expression between Day 0 and Day 7 iHep only/Co-culture, there is no significant difference between the Day 7 iHep only and Day 7 iHep Co-culture. There are no significant differences in CASP3 and CASP9 expression across all the samples.

      (4) Some figure panels and legends lack statistical annotations, and microscopy validation of morphological changes is missing.

      Although we agree that the morphology changes would be interesting, we think that this question is unfortunately outside the scope of our study. Although Kupffer cells are in direct contact with hepatocytes, they migrate from the liver parenchyma into the sinusoidal spaces where they primarily reside. We do not think that the morphology would add much to the paper, especially given that this is a 2D model.

      (5) The co-culture model with monocyte-derived macrophages is not fully characterised, making comparisons less informative.

      Although we agree that it would be interesting to look more closely at the monocyte-derived macrophage co-cultures as well, we think that this would be more suited to a future study as the transcriptomic analysis would likely include confounding effects of patient specific transcriptomic changes, and our primary focus was on developing an isogenic co-culture system.

      Reviewer #2 (Public review):

      Summary:

      This study builds on work by Glass and Guilliams showing that mouse Kupffer cells depend on the surrounding cells, including endothelium, hepatocytes, and stellate cells, for their identity. Herein, the authors extend the work to human systems. It nicely highlights why taking monocyte-derived macrophages and pretending they are Kupffer cells is simply misleading.

      Strengths:

      Many, including human cells, difficult culture assays, and important new data.

      Weaknesses:

      This reviewer identified minor queries only, rather than 'weaknesses' as such.

      Reviewer #3 (Public review):

      Summary:

      In this study, the authors establish a human in vitro liver model by co-culturing induced hepatocyte-like cells (iHEPs) with induced macrophages (iMACs). Through flow cytometry-based sorting of cell populations at days 3 and 7 of co-culture, followed by bulk RNA sequencing, they demonstrate that bidirectional interactions between these two cell types drive functional maturation. Specifically, the presence of iMACs accelerates the hepatic maturation program of iHEPs, while contact-dependent cues from iHEPs enhance the acquisition of Kupffer cell identity in iMACs, indicating that direct cell-cell interactions are critical for establishing tissue-resident macrophage characteristics.

      Functionally, the authors show that iMAC-derived Kupffer-like cells respond to pathological stimuli by producing interleukin-6 (IL-6), a hallmark cytokine of hepatic immune activation. When exposed to a panel of clinically relevant hepatotoxic drugs, the co-culture system exhibited concentration-dependent modulation of IL-6 secretion consistent with reported drug-induced liver injury (DILI) phenotypes. Notably, this response was absent when hepatocytes were co-cultured with monocyte-derived macrophages from peripheral blood, underscoring the liver-specific phenotype and functional relevance of the iMAC-derived Kupffer-like cells. Collectively, the study proposes this co-culture platform as a more physiologically relevant model for interrogating macrophage-hepatocyte crosstalk and assessing immune-mediated hepatotoxicity in vitro.

      Strengths:

      A major strength of this study lies in its systematic dissection of cell-cell interactions within the co-culture system. By isolating each cell type following co-culture and performing comprehensive transcriptomic analyses, the authors provide direct evidence of bidirectional crosstalk between iMACs and iHEPs. The comparison with single-culture controls is particularly valuable, as it clearly demonstrates how co-culture enhances functional maturation and lineage-specific gene expression in both cell types. This approach allows for a more mechanistic understanding of how hepatocyte-macrophage interactions contribute to the acquisition of tissue-specific phenotypes.

      Weaknesses:

      (1) Overreliance on bulk RNA-seq data:

      The primary evidence supporting cell maturation is derived from bulk RNA sequencing, which has inherent limitations in resolving heterogeneous cellular states and functional maturation. The conclusions regarding hepatocyte maturation are based largely on increased expression of a subset of CYP genes and decreased AFP levels - markers that, while suggestive, are insufficient on their own to substantiate functional maturation. Additional phenotypic or functional assays (e.g., metabolic activity, protein-level validation) would significantly strengthen these claims.

      We have added a discussion on the limitations of our study.

      (2) Insufficient characterization of input cell populations:

      The manuscript lacks adequate validation of the cellular identities prior to co-culture. Although the authors reference previously published protocols for generating iHEPs and iMACs, it remains unclear whether the cells used in this study faithfully retain expected lineage characteristics. For example, hepatocyte preparations should be characterized by flow cytometry for ALB and AFP expression, while iMACs should be assessed for canonical macrophage markers such as CD45, CD11b, and CD14 before co-culture. Without these baseline data, it is difficult to interpret the magnitude or significance of any co-culture-induced changes.

      We apologise for this oversight. Some of the markers were used in determining the purity of the iMacs before co-culture, but we did not include these plots for brevity. We have now added the purity plots in Supp Fig 2E, showing that the iMacs were more than 90% pure before co-culture. We acknowledge the concern about cross-contamination for bulk sequencing, and have added in Supp Fig 2G and H the expression of ALB in the iMac fraction, as well as the expression of CSF1R in the iHep fraction, showing minimal contamination with our gating strategy.

      (3) Quantitative assessment of IL-6 production is insufficient:

      The analysis of drug-induced IL-6 responses is based primarily on relative changes compared to control conditions. However, percentage changes alone are inadequate to capture the biological relevance of these responses. Absolute cytokine production levels - particularly in response to LPS stimulation - should be reported and directly compared to PBMC-derived macrophages to determine whether iMAC-derived Kupffer-like cells exhibit enhanced cytokine output. Moreover, the Methods section should clearly describe how ELISA results were normalized or corrected to account for potential differences in cell number, viability, or culture conditions.

      We apologise if this was unclear. The cytokine production from dosed cells was normalized based on the viability of cells measured from the same well.
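      The response does not give the normalization formula, so the following is only a sketch of one plausible reading: the measured IL-6 concentration is divided by the viable fraction from the same well and then expressed relative to the vehicle control. The function names and all numbers are hypothetical and are not taken from the manuscript.

```python
def viability_normalized(il6_pg_per_ml, viable_fraction):
    """Scale a measured IL-6 concentration by the well's viable fraction.

    Assumed formula (not confirmed by the source): IL-6 / viable fraction.
    """
    if not 0.0 < viable_fraction <= 1.0:
        raise ValueError("viable_fraction must be in (0, 1]")
    return il6_pg_per_ml / viable_fraction


def percent_of_control(treated, control):
    """Express a normalized treated value as a percentage of the control."""
    return 100.0 * treated / control


# Hypothetical wells: vehicle control and one drug-treated well.
control_norm = viability_normalized(200.0, 1.0)
treated_norm = viability_normalized(120.0, 0.8)
relative_il6 = percent_of_control(treated_norm, control_norm)

print(f"raw treated/control: {100.0 * 120.0 / 200.0:.0f}%")
print(f"viability-normalized treated/control: {relative_il6:.0f}%")
```

      In this made-up example the raw readout falls to 60% of control, but the viability-normalized value is 75%. This is why both the absolute concentrations and the exact normalization procedure matter for interpreting percentage changes.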

      (4) Unclear mechanistic interpretation of IL-6 modulation:

      The observed changes in IL-6 production upon drug treatment cannot be interpreted solely as evidence of Kupffer cell-specific functionality. For instance, IL-6 suppression by NSAIDs such as diclofenac is well known to result from altered prostaglandin synthesis due to COX inhibition, while leflunomide's effects are linked to metabolite-induced modulation of immune cell proliferation and broader cytokine networks. These mechanisms are distinct from Kupffer cell identity and may not directly reflect liver-specific macrophage function. Consequently, changes in IL-6 secretion alone - particularly without additional mechanistic evidence or analysis of other cytokines - are insufficient to conclude that co-culture with hepatocytes drives the acquisition of bona fide Kupffer cell maturity.

      We fully agree with the reviewer and have highlighted this in our discussion.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) GSE ID for RNA-seq data has not been provided.

      This has been included.

      (2) Line 291: Can the authors specify what they mean by "state-of-the-art"?

      What we mean here is what others in the field have also recently described. We have rewritten this to be clearer.

      (3) Lines 299-300: check sentence for grammar mistakes.

      We have rewritten and clarified this.

      (4) Figure 1B: The PCA does not really allow for following maturation trajectories. Also, all samples (day 3 Co-iHep, day 7 Co-iHep, day 7 iHep) look as if they cluster more or less together. Therefore, the conclusion drawn in lines 303-305 does not hold. Why is day 3 iHep not also shown here?

      We agree that PCA does not allow maturation trajectories to be followed. We had stated that it was only a hypothesis that co-culture promotes maturation, which we later validated by examining the expression of key hepatocyte markers and by Pearson correlation comparison with fetal hepatocytes.

      (5) Can the authors show that the cells that they are sorting in the double negative gate are indeed hepatocytes? Typically, these cells are big in cell size; therefore, showing the FSC/SSC gate would also be important.

      We have added the FSC/SSC gate in supp fig. 1E to show that the populations have different sizes.

      (6) Can the authors provide microscopy pictures of iHeps, iMacs, and the co-cultured cells for the reader to appreciate whether the morphology of cells already changes during the co-culture experiments?

      Although we agree that the morphology changes would be interesting, we think that this question is unfortunately outside the scope of our study. Although Kupffer cells are in direct contact with hepatocytes, they migrate from the liver parenchyma into the sinusoidal spaces where they primarily reside. We do not think that the morphology would add much to the paper, especially given that this is a 2D model.

      (7) Please show expression of apoptotic and ER stress genes comparing Day7 iHeps and Co-iHeps, since genes such as c-Fos and Ppp2r3b can also be associated with cellular stress.

      This has been included in Supp Fig 2C, where we’ve included the expression of ATF4, CASP3 and CASP9. Although there’s a significant difference in ATF4 expression between Day 0 and Day 7 iHep only/Co-culture, there is no significant difference between the Day 7 iHep only and Day 7 iHep Co-culture. There are no significant differences in CASP3 and CASP9 expression across all the samples.

      (8) In addition to the genes shown in Figure 1E, could the authors extract a longer gene list of maturing hepatocytes and display them all in bar graphs or heatmaps, or similar? E.g., Albumin expression is shown later, but why not show it already here?

      There are not many differences in the canonical hepatocyte markers, which is why we chose to show only the genes that differed. This can be seen in the later ALB expression plot, where there was no difference in ALB expression after 7 days of co-culture. Instead, we have included a new heatmap in Supp Fig 2B showing the top 40 genes that contribute to the similarity by Pearson correlation.

      (9) Along these lines, how do the authors ensure that they are culturing only hepatocytes and do not have a mixture of cells that may "dilute" the hepatocyte signature?

      Unfortunately, this is a limitation of our methodology, although the expression of key hepatic markers is routinely confirmed by qPCR to ensure that the majority of the cells are hepatocyte-like.

      (10) Lines 347-350: similar to the interpretation of the PCA for hepatocytes, this is a completely random interpretation. The expression of ALB in the co-cultured iMacs indicates that there are some hepatocytes that ended up in the macrophage gate.

      We agree and have highlighted and addressed this limitation in our discussion. Unfortunately, it is a limitation of bulk sequencing that a small amount of contamination might be present. However, the TPM values of ALB in the iMacs, for example, are extremely low, especially when compared to the hepatocytes, indicating that the level of contamination is likely to be very low. Likewise, the expression of CSF1R in the co-cultured iHeps is also extremely low. This has been included in Supp Fig 1F and G.

      (11) Figure 2D: Among the pathways shown, there are also stress pathways (acute phase response, HMGB1). Also for these cells, control of apoptotic and ER stress signatures is necessary.

      As mentioned, we have included some stress genes in Supp Fig 2C to address this.

      (12) Lines 385-386: Why would FCGRA3 indicate tissue residency? Is there literature to support this statement?

      CD16 is a marker often used to distinguish Kupffer cells from the surrounding cells, although it is also expressed by non-classical monocytes. We have clarified the text here (Lines 356-357).

      (13) Figure 3E: ALB and other genes were at the same or even lower levels expressed in D7 compared to D3. Why is that? Are the cells starting to de-differentiate after 7 days? Please discuss.

      This is a very interesting question that we have been wondering about ourselves, although sadly we do not have an answer yet. We hypothesized that this might be due to the activation of cell proliferation/developmental programmes as the cells are kept together longer, as shown by the expression of morphogens like OSM and IGF-2 after co-culture. We have added some discussion of this (Lines 532-540).

      (14) Line 459: Word "in" is double

      We thank the reviewer for catching this. It has been corrected.

      (15) Figure 5: The findings are interesting, but the co-culture model remains somewhat unclear. Can the authors show, e.g., using qRT-PCR, how hepatocytes are developing in this culture system? If the development with monocyte-derived macrophages is altered, then one would expect that also the cellular response is different.

      We agree with the reviewer, but we think that this question would be better answered in a follow-up study. We were looking to answer if the addition of isogenic iMacs would change the drug response of iHeps, and were using the PBMC-derived macrophages here as a control. A more complete study taking into account the genetic background of the donor PBMC-derived macrophages would be much more informative, but sadly outside of the scope of our present study.

      (16) Lines 482-484: The authors talk about LPS-treated cultures and refer to Figure 4. However, there is no graph shown for LPS.

      We apologise for being unclear here, but the co-cultures were co-treated with LPS during the drug stimulation assays, as it had been shown that LPS increases the sensitivity of the liver toward hepatotoxic drugs. We have clarified this in the main text (Lines 435-437).

      Reviewer #2 (Recommendations for the authors):

      (1) It would be nice to add some protein production by the hepatocytes. For example, can they produce albumin or some other protein that can be measured? Perhaps I missed this.

      Albumin and urea levels were assessed in the hepatocytes prior to co-culture in Supp Fig 1C; however, we did not measure the protein-level changes after co-culture, as the co-culture would also contain a significant number of macrophages, which we thought might affect the readout. Instead, after co-culture the primary analysis was done on the RNA levels of ALB and other cytochrome genes after sorting in Fig 3.

      (2) Was there an increase in hepatocyte number? Did one cell outgrow the other, or did they maintain numbers?

      The relative proportion of the iHeps remained the same, although we did see an expansion in the iMac population after 7 days by flow cytometry in Fig 1D.

      (3) What happens if the iMACs and the iHeps are grown in Costar chambers with pore sizes too small to allow for cell contact, but allowing supernatant to be continuously exposed to both cell types?

      With regard to the question of direct contact, we were primarily focused on the acquisition of a KC-like phenotype in the iMacs, which was why we chose to use conditioned iHep media as part of the iMac experimental setup. However, it would be very interesting to see in a follow-up study if the converse is also true, and whether secreted factors from the iMacs alone would be sufficient to drive the changes we observed in the iHeps after co-culture.

      (4) The discussion could use a brief paragraph on some limitations and what could be added to the co-culture system. For example, could stellate cells and sinusoidal endothelium also impart KC identity? Would growing KCs on endothelium provide a more natural substratum?

      Once again, these are very interesting questions which are unfortunately outside of the scope of our study. However, we have included a short section discussing this in the paper, as we do think that it would be interesting to look at iMacs educated by hepatocyte vs stellate cells for example (Lines 530-536).

      (5) The axonal guidance pathway in early iMACs is interesting. A recent report in vivo showed that macrophages migrate from the liver parenchyma into the sinusoids in neonates when they are still immature. The process could be chemotaxis, or it could be repulsion by parenchyma. Numerous axonal guidance molecules are repulsive, pushing axons away (robo/slit, etc). The migration of Kupffer cells into sinusoids could be a repulsive rather than a chemoattractant pathway. Did the RNA seq data provide any interesting molecules in this regard?

      Reviewer #3 (Recommendations for the authors):

      This manuscript presents a conceptually well-designed approach to modeling hepatocyte-macrophage crosstalk in vitro. The authors develop a co-culture system aimed at recapitulating key aspects of Kupffer cell (KC) identity and hepatocyte maturation. The data convincingly show that macrophages acquire KC-like features under co-culture conditions. However, several major issues limit the strength of the conclusions, the depth of mechanistic insight, and the translational impact of the work.

      First, the study relies heavily on bulk RNA-seq data with minimal functional or protein-level validation - particularly for hepatocyte maturation. To substantiate claims of functional maturation, additional assays measuring albumin secretion, urea production, and CYP activity are essential. Furthermore, the omission of zonation-associated markers (e.g., GLUL, CPS1, CYP2E1) leaves a critical gap in assessing whether the iHEPs achieve physiologically relevant functional states.

      Second, statistical interpretation and reporting are inconsistent. Significant and non-significant findings are frequently conflated, which risks overinterpretation. For instance, the reported reduction in HNF4A expression is not statistically significant, and AFP expression is only significantly reduced in Day 7 co-iHEPs - yet these distinctions are not clearly stated.

      Third, although the authors emphasize the role of cell-cell contact in promoting KC identity, no experiments (e.g., transwell separation, adhesion-blocking assays) directly test this claim. As a result, the mechanistic basis for this conclusion remains speculative.

      Finally, while the data support enhanced macrophage differentiation toward a KC-like phenotype, the evidence that co-culture significantly promotes hepatocyte maturation is far less convincing and requires additional functional, mechanistic, and statistical validation before firm conclusions can be drawn.

      Minor comments:

      (1) Methodology: The choice of a 2.5:1 iHEP:iMAC ratio is not justified. This proportion does not reflect physiological hepatocyte-to-KC ratios in vivo and should be either rationalized or benchmarked against native liver composition.

      We admit that the ratio here is on the higher side, but it has been previously reported that there can be between 20 and 40 macrophages per 100 hepatocytes (1:5 to 1:2.5) in the adult mouse liver (Baratta et al., 2009), while admittedly in the developing mouse liver the ratio is closer to 1:4 (Lopez et al., 2011). We chose 1:2.5 as we anticipated that not all of the macrophages would be able to attach, and would thus be lost during media change, as evidenced by the flow cytometry on Day 3 of the co-culture, where only 20% of the cells had clear CD45 and CD14 expression. We have clarified our methodology in the paper (Lines 141-143).
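      The ratio arithmetic in this response can be checked with a short sketch. It assumes that "1:2.5" means one macrophage per 2.5 hepatocytes and that the 20% figure refers to the fraction of all cells in the co-culture; the helper names are hypothetical.

```python
def hepatocytes_per_macrophage(macrophages_per_100_hepatocytes):
    """Convert 'N macrophages per 100 hepatocytes' to hepatocytes per macrophage."""
    return 100.0 / macrophages_per_100_hepatocytes


def macrophage_fraction(hep_per_mac):
    """Fraction of all cells that are macrophages at a 1:hep_per_mac ratio."""
    return 1.0 / (1.0 + hep_per_mac)


low_density = hepatocytes_per_macrophage(20)   # 20 per 100 hepatocytes
high_density = hepatocytes_per_macrophage(40)  # 40 per 100 hepatocytes
seeded_fraction = macrophage_fraction(2.5)     # seeding ratio of 1:2.5

print(f"reported adult range: 1:{low_density:g} to 1:{high_density:g}")
print(f"macrophage fraction at seeding: {seeded_fraction:.1%}")
```

      Under these assumptions, a 1:2.5 seeding ratio corresponds to about 29% macrophages at plating, so the roughly 20% CD45/CD14-positive cells reported on Day 3 would be consistent with some macrophage loss after seeding.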

      (2) Effect of iMAC on iHEP (Section 3.2, Supplementary Figure 1E):

      (2.1) The authors should explain why Day 3 co-cultured iHEPs show stronger transcriptomic similarity to primary hepatocytes than Day 7 cells. Possible biological mechanisms (e.g., transient paracrine signaling or temporal changes in maturation dynamics) should be discussed.

      We have added some discussion for this (Lines 309-311, 536-540).

      (2.2) The figure legend refers to "fetal hepatocytes," while the correlation map states "hepatocytes." This discrepancy must be clarified. Moreover, if fetal hepatocytes are used as the reference, and the goal is to assess maturation, comparisons to adult hepatocytes are necessary. 

      The comparison was done against fetal hepatocytes, and this has been clarified in the figure. We chose to use fetal hepatocytes here as it would be unfair to compare iPSC-derived cells that are less than 3 weeks old to adult human tissue, and any similarities or differences between the mono/co-cultures and the adult tissue might be due to the shifting transcriptomic landscape during development. However, we do recognise the nuanced nature of using “maturation” here, and what we mean is that the iPSC-derived cells become more similar to their in vivo counterparts.

      (2.3) Baseline characterization of both cell types before co-culture is insufficient. For iHEPs, flow cytometry data on ALB and AFP positivity rates should be presented, along with post-co-culture changes. For iMACs, marker expression (CD45, CD11b, CD14) should be shown before and after co-culture. The methods mention CD163, CX3CR1, and CD11b, but these data are absent from the results. Additionally, the gating strategy for cell sorting prior to bulk RNA-seq must be clearly described - including how potential cross-contamination of cell fractions (e.g., macrophages in the hepatocyte population) was excluded.

      We apologise for this oversight. Some of the markers were used in determining the purity of the iMacs before co-culture, but we did not include these plots for brevity. We have now added the purity plots in Supp Fig 2E, showing that the iMacs were more than 90% pure before co-culture. We acknowledge the concern about cross-contamination for bulk sequencing, and have added in Supp Fig 2G and H the expression of ALB in the iMac fraction, as well as the expression of CSF1R in the iHep fraction, showing minimal contamination with our gating strategy.

      (3) IGF2 Expression: The observed upregulation of IGF2, a fetal marker, contradicts the conclusion that co-culture promotes hepatocyte maturation. This inconsistency should be addressed, and possible explanations (e.g., transient fetal-like activation driven by macrophage-derived signals) discussed. The lack of statistical significance for this finding must also be explicitly noted.

      We thank the reviewer for pointing this out. The expression of IGF2 was actually significantly different when comparing the Day 0 Hepatocyte only and Day 7 Hepatocyte only to the Day 3 Co-cultured Hepatocytes, but the significance is lost with the Day 7 co-cultured Hepatocytes. One possible explanation is as the reviewer suggested, that there is a transient program that is activated upon co-culture that is subsequently downregulated. We have updated the figure and text, and added some discussion to reflect this (Lines 309-311, 536-540).

      (4) Effect of iHEP on iMAC: The reported upregulation of KC-related genes is overstated. Changes in LYVE1 and ID1 are not statistically significant (Figure 2G), yet they are presented as meaningful. Clear separation of statistically significant results from non-significant trends is critical to avoid overinterpretation.

      We apologise for this, as it was never our intention to present these markers as significant, but rather we presented these markers because we thought that these markers would be of interest to the audience. We have clarified the text to reflect that these are trends and non-significant (Lines 367-369).

      (5) Mimicking In Vivo Clinical Responses:

      (5.1) The authors' conclusion that IL-6 responses are not recapitulated when iMACs are replaced by monocyte-derived macrophages (MoMs) is not fully supported by the data presented. In fact, the MoM co-cultures exhibit a noticeable trend toward increased IL-6 production (e.g., approximately 150% with LTG at 66.6 µM and 400 µM), suggesting that some degree of responsiveness is retained. To substantiate the claim that the observed cytokine modulation is unique to iKC-containing co-cultures, the authors should perform direct statistical comparisons of absolute IL-6 secretion levels between iKC and MoM co-cultures at each drug concentration. Such analyses are essential to determine whether the differences are statistically significant and biologically meaningful, and to clarify whether the observed effects truly reflect KC-specific functionality rather than general macrophage activation.

      (5.2) The effects of drug exposure on hepatocytes themselves are not addressed. It is important to evaluate whether the co-culture remains viable under treatment, whether it recovers after drug withdrawal, and whether there is evidence of cytotoxicity or irreversible phenotypic loss.

      (6) Interpretation of IL-6 Modulation and Model Specificity:

      The authors show that IL-6 secretion in their co-culture system varies in response to multiple hepatotoxic drugs and parallels some reported clinical trends - notably, a concentration-dependent decrease with diclofenac (DIC) and leflunomide (LFM). They further report that this pattern is not observed in hepatocyte-PBMC-derived macrophage co-cultures, and they conclude that iMAC/iKC-like cells are essential for capturing immune-mediated hepatotoxic responses. However, the data presented do not fully justify such a conclusion. Several key mechanistic issues weaken the interpretation:

      (6.1) Mechanistic ambiguity in the DIC response: The decrease in IL-6 following DIC exposure is most likely attributable to reduced prostaglandin E₂ (PGE₂) production via COX inhibition, which secondarily suppresses IL-6 signaling. This effect is a general pharmacological property of NSAIDs and is not necessarily reflective of Kupffer cell-specific pathways. Direct evidence - such as prostanoid quantification or PGE₂ rescue experiments - is required to establish that the observed effects are liver-specific rather than nonspecific NSAID responses.

      (6.2) Pharmacogenetic complexity in the LFM response: LFM-induced hepatotoxicity is highly variable and largely dependent on CYP2C9 polymorphisms, which determine conversion to the active metabolite teriflunomide. Because hepatotoxicity and the associated cytokine responses are not universal among patients, a simplified co-culture model lacking metabolic diversity cannot be assumed to faithfully reproduce patient-specific immune responses. The observed IL-6 suppression could arise from differences in metabolic activation, intracellular exposure, or indirect signaling changes rather than from intrinsic KC-specific mechanisms.

      These points significantly undermine the authors' claim that IL-6 modulation provides definitive evidence of model specificity or predictive value. At minimum, the manuscript should (i) explicitly acknowledge these mechanistic limitations, (ii) include supporting data such as prostanoid profiling, CYP2C9 modulation, or teriflunomide quantification, and (iii) temper its claims regarding the model's capacity to recapitulate immune-mediated hepatotoxicity. Without such evidence, the current interpretation risks overstating the functional significance and translational relevance of the co-culture system.

      We fully agree with the reviewer and have highlighted this in our discussion (Lines 540 – 551).

    1. eLife Assessment

      The analysis of neural morphology across Heliconiini butterfly species revealed brain area-specific changes associated with new foraging behaviours. While the volume of the centre for learning and memory, the mushroom bodies, was known to vary widely across species, these new, valuable results show conservation of the volume of a center for navigation, the central complex, but with specific changes in neuropeptide expression in the noduli and in the numbers of ellipsoid body ring neurons. The presented evidence is convincing for both volumetric conservation in the central complex and fine neuroanatomical differences associated with pollen feeding, delivered by experimental approaches that are applicable to other insect species. This work will be of interest to evolutionary biologists, entomologists, and neuroscientists.

    2. Reviewer #1 (Public review):

      The authors previously reported that Heliconius, one genus of the Heliconiini butterflies, evolved to be efficient foragers to feed on pollen of specific plants and have massively expanded mushroom bodies. Using the same image dataset, the authors segmented the central complex and associated brain regions and found that the volume of the central complex relative to the rest of the brain is largely conserved across the Heliconiini butterflies. By performing immunostaining to label a specific subset of neurons, the authors found several potential sites of evolutionary divergence in the central complex neural circuits, including the number of GABAergic ellipsoid body ring neurons and the innervation patterns of Allatostatin A expressing neurons in the noduli. These neuroanatomical data will be helpful to guide future studies to understand the evolution of the neural circuits for vector-based navigation.

      Strength

      The authors used a sufficiently large dataset from 307 individuals of 41 species of Heliconiini butterflies to solidify the quantitative conclusions, and present new microscopy data for fine neuroanatomical comparison of the central complex.

      Weakness

      (1) Although the figures display a concise summary of anatomical findings, it would be difficult for non-experts to learn from this manuscript how to identify the same neuronal processes in the raw confocal stacks. It would be helpful to have instructive movies that give a step-by-step guide to the identification of neurons of interest, segmentations, and 3D visualizations (rotation) for several examples, including ER neurons (to supplement the text in lines 347-353) and Allatostatin A neurons.

      (2) Related to (1), it was difficult for me to assess whether the data in Fig 7 support the authors' conclusion that ER neuron number increased in Heliconius melpomene. By my understanding, the resolution of this dataset isn't high enough to trace individual axons, and therefore the authors do not rule out that a portion of the "ER ring neurons" in Heliconius may not innervate the ER, as stated in Line 635 "Importantly, we also found that some ER neurons bypass the ellipsoid body and give rise to dense branches within distinct layers in the fan-shaped body (ER-FB)". If they don't innervate the ellipsoid body, why are they named "ER neurons"?

      (3) The discussion around lines 577-584 requires the assumption that each ellipsoid body (EB) ring neuron typically arborises in a single microglomerulus to form a largely one-to-one connection with TuBu neurons within the bulb (BU), and therefore that the number of BU microglomeruli should provide an estimate of the number of ER neurons. Explain this key assumption or provide an alternative explanation.

      (4) The details of antibody information are missing in the Key resource table. Instead of citing papers, list the catalogue numbers and identifier for commercially available antibodies, and describe the antigen and if they are monoclonal or polyclonal. Are antigens conserved across species?

      (5) I did not understand why the authors assume that foraging to feed on pollen is a more difficult cognitive task than foraging to feed on nectar. Would it be possible that they are equally demanding tasks, but pollen feeding allows Heliconius to pass more proteins and nucleic acids to their offspring, and therefore they can develop larger mushroom bodies?

      Comments on revisions:

      The authors fully addressed my concerns and significantly improved the accessibility of the manuscript.

    3. Reviewer #2 (Public review):

      Summary

      In this study, Farnsworth et al. ask whether the previously established expansion of mushroom bodies in the pollen foraging Heliconius genus of Heliconiini butterflies co-evolved with adaptations in the central complex. Heliconius trap line foraging strategies to acquire pollen as a novel resource require advanced spatial memory mediated by larger mushroom bodies, but the authors show that related navigation circuits in the central complex are highly conserved across the Heliconiini tribe, with a few interesting exceptions. Using general immunohistochemical stains and 3D reconstruction, the authors compared volumes of central complex regions, and unlike the mushroom bodies, there was no evidence of expansion associated with pollen feeding. However, a second dataset of neuromodulator and neuropeptide antibody labeling reveals more subtle differences between pollen and non-pollen foragers and highlights sub-circuits that may mediate species-specific differences in behavior. Specifically, the authors found an expansion of GABAergic ER neurons projecting to the fan-shaped body in Heliconius, which may enhance their ability to path-integrate. They also found differences in Allatostatin A immunoreactivity, particularly increased expression in the noduli associated with pollen feeding. These differences warrant closer examination in future studies to determine their functional implication on navigation and foraging behaviors.

      Strengths

      The authors leveraged a large morphological data set from the Heliconiini to achieve excellent phylogenetic coverage across the tribe with 41 species represented. Their high-quality histology resolves anatomical details to the level of specific, identifiable tracts and cell body clusters. They revealed differences at a circuit level, which would not be obvious from a volumetric comparison. The discussion of these adaptations in the context of central complex models is useful for generating new hypotheses for future studies on the function of ER-FB neurons and the role of Allatostatin A modulation in navigation.

      The conclusions drawn in this paper are measured and supported by rigorous statistics and evidence from micrographs.

      Weaknesses

      The majority of results in this study do not reveal adaptations in the central complex associated with pollen foraging. However, reporting conserved traits is useful and illustrates where developmental or functional constraints may be acting. The authors have now revised the introduction to set up two alternate hypotheses.

      In the main text, the authors describe differences in GABAergic ER neurons between H. melpomene and an outgroup species, with additional images from other species in Figure S4. Quantification of ER cells in these other species would strengthen the claim that these are increased in Heliconius and not just the focal species, but this may hopefully be pursued in future studies.

      Comments on revisions:

      I am satisfied with the authors' revisions.

    4. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      The analysis of neural morphology across Heliconiini butterfly species revealed brain area specific changes associated with new foraging behaviours. While the volume of the centre for learning and memory, the mushroom bodies, was known to vary widely across species, new, valuable results show conservation of the volume of a center for navigation, the central complex. The presented evidence is convincing for both volumetric conservation in the central complex and fine neuroanatomical differences associated with pollen feeding, delivered by experimental approaches that are applicable to other insect species. This work will be of interest to evolutionary biologists, entomologists, and neuroscientists.

      Many thanks for your assessment and time handling this manuscript. We value the constructive input of both reviewers and believe that the result is an improved publication.

      Public Reviews:

      Reviewer #1 (Public review):

      The authors previously reported that Heliconius, one genus of the Heliconiini butterflies, evolved to be efficient foragers to feed on pollen of specific plants and have massively expanded mushroom bodies. Using the same image dataset, the authors segmented the central complex and associated brain regions and found that the volume of the central complex relative to the rest of the brain is largely conserved across the Heliconiini butterflies. By performing immunostaining to label a specific subset of neurons, the authors found several potential sites of evolutionary divergence in the central complex neural circuits, including the number of GABAergic ellipsoid body ring neurons and the innervation patterns of Allatostatin A expressing neurons in the noduli. These neuroanatomical data will be helpful to guide future studies to understand the evolution of the neural circuits for vector-based navigation.

      We thank Reviewer 1 for the constructive feedback and criticism, which has strengthened this publication.

      Strengths:

      The authors used a sufficiently large scale of dataset from 307 individuals of 41 species of Heliconiini butterflies to solidify the quantitative conclusions and present new microscopy data for fine neuroanatomical comparison of the central complex.

      Weaknesses:

      (1) Although the figures display a concise summary of anatomical findings, it would be difficult for non-experts to learn from this manuscript to identify the same neuronal processes in the raw confocal stacks. It would be helpful to have instructive movies to show a step-by-step guide for identification of neurons of interest, segmentations, and 3D visualizations (rotation) for several examples, including ER neurons (to supplement texts in line 347-353) and Allatostatin A neurons.

      We approached this with the following logic:

      All 3D segmentations were animated, to illustrate how they are generated from raw imaging data. This means we are providing a video file for each major species group (Heliconius/outgroup-Heliconiini) for Figure 4 (general CX anatomy), Figure 7 (ER neuron projections), Figure S5 (ER neuron/bulb anatomy). This visual connection should help the reader relate 3D segmentations to image stacks. We have also added a reference to these videos in the relevant Figure captions.

      We also annotated image stacks, but did so selectively. We annotated key stacks of Figure 4 (general CX anatomy), Figure 7 (ER neuron projections) and Figure S5 (ER neuron/bulb anatomy), and refer to them in the corresponding figure captions.

      We refrained from annotating stacks of Figures 5, 6, 8 and S4. This is because we believe that the annotations we have performed in the figure panels will be sufficient for readers interested in the finer detail of these anatomies who are familiar with general CX anatomy.

      We believe that our approach will help the reader to gain a visual illustration of those parts of the manuscript which report key results and novel insights, such as ER neuronal variation, and that the data and figures collectively provide accessible information sufficient for this purpose.

      Text changes in Figure captions 4, 7 and S5: “See animated 3D segmentations and annotated stacks in file repository.”

      (2) Related to (1), it was difficult for me to assess if the data in Figure 7 support the authors' conclusion that ER neuron number increased in Heliconius melpomene. By my understanding, the resolution of this dataset isn't high enough to trace individual axons, and therefore the authors do not rule out that a portion of the "ER ring neurons" in Heliconius may not innervate the ER, as stated in Line 635 "Importantly, we also found that some ER neurons bypass the ellipsoid body and give rise to dense branches within distinct layers in the fan-shaped body (ER-FB)". If they don't innervate the ellipsoid body, why are they named "ER neurons"?

      Thanks for pointing to this. We believe this is primarily a nomenclature issue but have tried to specify in the text.

      Ultimately, the neurons from this group that project to the EB (forming the actual ring neurons) and those that project to the FB (whose function is so far unclear) emerge through the same lineage, DALv2 (as determined by Kandimalla et al 2023), and therefore have a common developmental origin (also noted by Homberg et al 2018). To acknowledge this common origin and to simplify nomenclature, and thereby make the text easier for non-experts to follow, we specify which DALv2 progeny project to which areas, but refer to both adult neuron populations as “ER neurons”. We have changed the following text to state our definition explicitly, which we hope mitigates the understandable confusion.

      Lines 354-357: “Here, we refer to these neurons, as well as those neurons projecting to the fan-shaped body (GU neurons in [66]), as ER neurons due to their common developmental origin [45,66] and to simplify anatomical descriptions.”

      Lines 386-387: “Whether these ER neurons solely branch in the fan-shaped body, as shown for GU neurons elsewhere [66] or have additional side branches entering the ellipsoid body is not clear.”

      (3) Discussions around the lines 577-584 require the assumption that each ellipsoid body (EB) ring neuron typically arborises in a single microglomerulus to form a largely one-to-one connection with TuBu neurons within the bulb (BU), and therefore, the number of BU microglomeruli should provide an estimation of the number of ER neurons. Explain this key assumption or provide an alternative explanation.

      Thanks for this. We do not think that our hypothesis necessarily requires any specific assumptions regarding the ratio of microglomeruli to ER or TuBu neurons. Even in Drosophila the ratio of ER to MG is only approximately 1:1, as some microglomeruli seem to combine into one. In other species this relationship might be very different. Indeed, our data suggest that in outgroup-Heliconiini the ratio is 4.4 microglomeruli to 1 ER neuron, and in Heliconius it is 3.4. However, as these MG numbers are extrapolated and cannot be precisely counted, they may be too imprecise to support a definite conclusion, which is why we do not mention this in the text. Importantly, extrapolation in the current form is a valid additional way for us to describe overall bulb anatomy (next to bulb volume and average microglomerulus size).

      In any case, the inference we make here is that a conserved bulb anatomy in volume, MG numbers and size supports our assumption that the additional neurons in the ER neuron group/DALv2 progeny do not arborize in the bulb, but do so in the SMP/SLP region and in the fan-shaped body. We believe we have described this inference accurately in the current manuscript.

      An additional point, not mentioned in the manuscript but emerging through lineage annotations of connectome data, is that some DALv2 progeny have been identified as MBONs as well as being GABAergic, and these could potentially be the ER-FB neurons that we describe (Schlegel et al 2024 Nature). We refrain from mentioning this in the manuscript, as it is too speculative, but we thought the reviewer may be interested in this observation.

      (4) The details of antibody information are missing in the Key resource table. Instead of citing papers, list the catalogue numbers and identifier for commercially available antibodies, and describe the antigen, and whether they are monoclonal or polyclonal. Are antigens conserved across species?

      We have now added substantial information to Table 2, including research resource identifiers (RRIDs) and antigen descriptions, as well as information about specificity and conservation. In the text itself, in line 757, we already provide publications that have illustrated conservation very extensively.

      We believe that with the additional information provided in Table 2, all necessary information is now provided.

      (5) I did not understand why authors assume that foraging to feed on pollens is a more difficult cognitive task than foraging to feed on nectar. Would it be possible that they are equally demanding tasks, but pollen feeding allows Heliconius to pass more proteins and nucleic acids to their offspring and therefore they can develop larger mushroom bodies?

      This is an excellent point. Our current understanding is that pollen feeding is a cognitively more demanding task because a) the density of pollen resources is lower than that of nectar resources, and b) the competition for pollen is higher (pollen is depleted quickly, and Heliconius compete with each other and with other taxa, including hummingbirds). There is therefore a benefit to high foraging efficiency, which favours the evolution of learning. This is likely reinforced by the long lives of Heliconius, which live up to a year, compared to ~4 weeks for most outgroups, and by the temporal stability of major pollen resources, which means a memorised location provides a benefit for long periods of time (Young and Montgomery 2020 Proc B).

      We now refer to an additional publication (Young and Montgomery 2020 Proc B) in lines 103-104 for a fuller description of the ecology of pollen feeding, and in the current manuscript simply focus on the impact of mushroom body expansion on the CX.

      Reviewer #2 (Public review):

      Summary:

      In this study, Farnsworth et al. ask whether the previously established expansion of mushroom bodies in the pollen foraging Heliconius genus of Heliconiini butterflies co-evolved with adaptations in the central complex. Heliconius trap line foraging strategies to acquire pollen as a novel resource require advanced spatial memory mediated by larger mushroom bodies, but the authors show that related navigation circuits in the central complex are highly conserved across the Heliconiini tribe, with a few interesting exceptions. Using general immunohistochemical stains and 3D reconstruction, the authors compared volumes of central complex regions, and unlike the mushroom bodies, there was no evidence of expansion associated with pollen feeding. However, a second dataset of neuromodulator and neuropeptide antibody labeling reveals more subtle differences between pollen and non-pollen foragers and highlights sub-circuits that may mediate species-specific differences in behavior. Specifically, the authors found an expansion of GABAergic ER neurons projecting to the fan-shaped body in Heliconius, which may enhance their ability to path-integrate. They also found differences in Allatostatin A immunoreactivity, particularly increased expression in the noduli associated with pollen feeding. These differences warrant closer examination in future studies to determine their functional implication on navigation and foraging behaviors.

      We thank Reviewer 2 for the constructive and thorough review. We believe that addressing these criticisms has improved this publication.

      Strengths:

      The authors leveraged a large morphological data set from the Heliconiini to achieve excellent phylogenetic coverage across the tribe with 41 species represented. Their high-quality histology resolves anatomical details to the level of specific, identifiable tracts and cell body clusters. They revealed differences at a circuit level, which would not be obvious from a volumetric comparison. The discussion of these adaptations in the context of central complex models is useful for generating new hypotheses for future studies on the function of ER-FB neurons and the role of Allatostatin A modulation in navigation.

      The conclusions drawn in this paper are measured and supported by rigorous statistics and evidence from micrographs.

      Weaknesses:

      The majority of results in this study do not reveal adaptations in the central complex associated with pollen foraging. However, reporting conserved traits is useful and illustrates where developmental or functional constraints may be acting. The implied hypothesis in the introduction is that expansion of mushroom bodies in Heliconius co-evolved with central complex adaptations, so it may be helpful to set up the alternate hypotheses in the beginning.

      Thank you for this relevant comment. We have added the following text in lines 124-128:

      “Indeed, these circumstances permit us to test the hypotheses that modifications in the mushroom bodies either occurred in isolation from other integrative centres, or that they occurred in concert with specific changes in centres, such as the central complex. This provides insights into the functional flexibility of two interacting, integrative centres across evolutionary time.”

      In the main text, the authors describe differences in GABAergic neurons "across several species" but only one Heliconius and one outgroup species seem to be represented in the figures. ER numbers in Figure 7H are only compared for these two species. If this data is available for other species, it would strengthen the paper to add them to the analysis, since this was one of the most intriguing findings in the study. I would want to know if the increased ER number is a trend in Heliconius or specific to H. melpomene.

      This points to imprecise phrasing. We indeed have additional data in other species, but unfortunately not to an extent that would permit quantification of cell numbers, which is why we chose to put these data into the supplement, Fig. S4.

      We modified the text to more directly point at the additional data in Fig S4, now reading in lines 362-368

      “…, we noticed a pronounced difference in a portion of projections leading into the fan-shaped body and a strong difference in signal inside layer III in our two focal species H. melpomene and D. iulia, as well as other representatives of the Heliconiini tribe (Figure S4A-B, Figure 7). To understand how these differences could have occurred, we quantified ER neuron numbers in our focal species, and identified a significant difference, reflecting a 35% increase in Heliconius (t = 4.221, P = 0.004; Figure 7H).”

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) Add a detailed description about each of the tiff files that were deposited at https://doi.org/10.5281/zenodo.15304965. It was hard for me to relate these raw images with the Figure panels. For instance, "Melp_GAD_26-F_detailed_conc.tif" in the Figure 7 folder seems to be used to make Figure 7L and N, but that information is cryptic.

      We agree with the reviewer. We added further descriptions, and have created a detailed readme file which explains which original file refers to which figure. Together with the efforts for Reviewer 1’s first comment, we hope that this updated version of our repository is easier to understand.

      In addition, we corrected the image orientation in some of the supplied files, which was originally incorrect.

      (2) Add descriptions about the dataset for large-scale volumetric analysis. With the current methods and texts, it is hard to understand what kinds of staining and microscopes were used. I initially thought that they could be micro-CT data.

      We have made two improvements:

      We have added an additional readme file to explain the different datasets, and which datasets were used for each figure, to relate them to the original data deposited at zenodo.org (see your previous comment).

      We have added descriptions in several places in the manuscript file, i.e.

      Lines 133-135, now reading “To assess evidence of volumetric changes in the central complex and associated neuropils, we drew data from a large dataset of immunostained brains from 307 individuals of 41 species, …”

      Lines 144-149, now reading “We used a combination of phylogenetic comparative analysis across a large dataset of brains immunostained against the structural marker synapsin in 41 species and 307 individuals, and more targeted sampling of species that represent the behavioural and neuroanatomical diversity of Heliconiini for more fine-scale assessments of patterns of divergence in substructures of the CX with various antibodies (Figure 1A-B).”

      (3) Line 275: Non-expert readers would need an explanation about what the gamma lobe is.

      Agreed and added in line 273

      “Some of the ventral projections seemed to directly originate from the γ lobe, a portion of the mushroom body, thus potentially labelling projections of mushroom body output neurons into the fan-shaped body (Figure 5a-c) [12,21].”

      (4) Figures 4 I-L are missing.

      We modified the figure caption accordingly, and address annotated differences more directly. This section now reads

      “G/H: Labelling reveals two distinguishable layers in the fan-shaped body while additional staining elsewhere reveals further detail (arrows in G/H-2/3). Thicker tract conflations indicate the columnar architecture determined through the four columnar neuron bundles (arrowheads in G/H-3). Labelling in the EB reveals two pronounced layers (arrows in G/H-1/2), while obvious columns could not be indicated. PB protocerebral bridge, FB fan-shaped body, EB ellipsoid body. A anterior, P posterior. Scale bars are 50 μm.”

      (5) In the current version of Figure 1B, AOTU is displayed with the mushroom body. The authors can emphasize its relation to the central complex by showing it on the right side of panels together with the central complex.

      Great suggestion. We have done this now. We have kept the AOTU at the scale of the MB, as indicated by the different scale bars at the bottom of the figure, because we are showing the CX at a slightly larger scale.

      (6) Figure 1C: What do the colors of the lines represent?

      We now changed these colours so that they correspond to the colours chosen in Figures 2 and S2 as well as in a previous publication of the lab, added an asterisk next to Heliconius aoede, and added text to the figure legend:

      “Colour indicates focal groups here and elsewhere [29]. The asterisk at the branch of H. aoede indicates a secondary loss of pollen feeding.”

      (7) Figures 2A and B: What does the size of the circles represent? I guess that small ones are individuals, and larger ones are species averages. Plots with only species averages would be easier to see. It is difficult to distinguish Heliconius and Heliconius aoede in these panels. It would be easier if Heliconius circles were outlined with thin black lines.

      Thanks for this. We wanted to keep both the averages and individual data points in one figure, so as not to overcrowd the manuscript with additional figures. We hope that the changes we made nevertheless address the confusion sufficiently. We made the following modifications to Figures 2, S1 and S2:

      (1) Added text in the figure legend clarifying what solid and transparent circles indicate (“Solid data points indicate species averages, while opaque circles indicate individual data points.”)

      (2) Added, as suggested, additional contours, to all Heliconius data points, and added corresponding text to the legend (“Black contours indicate Heliconius sp. data points.”)

      (3) Changed opacity settings of individual data points.

      Reviewer #2 (Recommendations for the authors):

      (1) Line 391 and Methods. It was unclear how the extrapolated microglomeruli numbers were calculated. Please clarify this in the methods.

      Agreed. We substantially modified the text to address this.

      Lines 392-396: “We generated high resolution images of the bulb to determine its size (Figure S5 C-F), and 3D segmented seven microglomeruli per individual with which we generated an extrapolated approximation of total microglomeruli number by dividing bulb volume with average microglomerulus volume. This was necessary as most microglomeruli were not discernible from each other (Figure S5 G-H).”

      Lines 862-873: “To segment the bulb, we created high resolution images and were particularly careful to only segment the area of the bulb that comprised large synapses/glomeruli, excluding parts of the LEa/IT projection. This was essential, because we relied on extrapolating the total number of microglomeruli from a subset of segmented microglomeruli and the total volume that contained microglomeruli, which means any section containing tracts and not glomerular structures would skew the estimated total number of microglomeruli. Extrapolation was necessary, as not all microglomeruli were visually discernible. We achieved an unskewed bulb volume by leaving out dense pieces of tubulin-positive tract material. We segmented seven microglomeruli per individual from the posterior section of the bulb, where they were most clearly visible, to get the most comparable impression across individuals and species. We then calculated average microglomerulus size and divided this by bulb volume to determine an approximation of microglomeruli number.”
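
      The extrapolation described above can be sketched in a few lines. This is a minimal illustration only, not the authors' analysis code: the function name and all numbers are hypothetical, and the direction of the division follows the wording quoted for lines 392-396 (bulb volume divided by average microglomerulus volume), since only that direction yields a count.

```python
def estimate_microglomeruli(bulb_volume, segmented_volumes):
    """Approximate the total microglomeruli number in one bulb.

    bulb_volume: volume of the glomerular part of the bulb.
    segmented_volumes: volumes of the individually segmented
        microglomeruli (seven per individual in the text above).
    Both must use the same volume unit.
    """
    if not segmented_volumes:
        raise ValueError("need at least one segmented microglomerulus")
    mean_volume = sum(segmented_volumes) / len(segmented_volumes)
    # Count = total glomerular volume / mean volume of one microglomerulus.
    return bulb_volume / mean_volume


# Hypothetical values, chosen only to make the arithmetic easy to follow.
example_bulb_volume = 1000.0
example_segmented = [10.0] * 7
example_estimate = estimate_microglomeruli(example_bulb_volume, example_segmented)
```

      Because the estimate scales directly with bulb volume, any tract material left inside the segmented bulb inflates the count, which is why the quoted methods text stresses excluding it.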

      (2) Line 439. It would be helpful to add that Kaiser et al. studied honeybees.

      Agreed! Now reads in lines 443-444

      “Moreover, Kaiser et al. [75] identified Allatostatin A expression in three fan-shaped and two ellipsoid body layers in the honey bee brain, …”

      (3) Line 492. "outcome" should be "outcomes".

      We believe that this refers to original line 481. Corrected. Thank you.

      (4) Figure 3B. If there is significance to the colors and triangle directions, please include a key/legend.

      We have added:

      “Cell type depictions are examples with localisation inside each neuropil being purely visual (as well as their colour), while triangles indicate approximate output sites.”

      We also corrected the following issues that were noted during our revisions:

      line 587, wrong reference.

      We updated references 37 and 44, which are now respectively

      Hodge, E. A. et al. Modality-specific long-term memory enhancement in Heliconius butterflies. Philos Trans R Soc Lond B Biol Sci 380, 20240119 (2025).

      Hodge, E. A. et al. Conservation of sensory pathways implies a localised change in the mushroom bodies is associated with cognitive evolution in Heliconius butterflies. Evol qpag005 (2026) doi:10.1093/evolut/qpag005.

      Figure S5 had an error in panels C and D, where the pictures in C were actually for H. melpomene in D and the reverse; the other panels were correct. We have corrected this.

      In the data submitted on Zenodo: we corrected a few inconsistencies in channel colours and orientation in the .tiff files for Fig 6, 8 and S4.

      We added important bulb 3D segmentation files to the repository on Zenodo.

    1. eLife Assessment

      This valuable study introduces miRTarDS, a novel computational framework that predicts microRNA-target interactions based on a publicly available pretrained Sentence-BERT language model and downstream classification analysis. The strength of the evidence is incomplete, as the evaluation framework relies on unreliable ground-truth and false sets. Furthermore, the analysis fails to compare miRTarDS against existing state-of-the-art biomedical language models.

    2. Reviewer #1 (Public review):

      The author presents a new method for microRNA target prediction based on (1) a publicly available pretrained Sentence-BERT language model that the author fine-tunes using MeSH information and (2) downstream classification analysis for microRNA target prediction. In particular, the author's approach, named "miRTarDS", attempts to solve the microRNA target prediction problem by utilizing disease information (i.e., semantic similarity scores) from their language model. The author then compares the prediction performance with other sequence- and disease-based methods and attempts to show that miRTarDS is superior or at least comparable to existing methods. The author's general approach to this microRNA target prediction problem seems promising, but fails to demonstrate concrete computational evidence that miRTarDS outperforms other existing methods. The author's claim that disease information-based language models are sufficient is unfounded. The manuscript requires substantial rewriting and reorganization for readers with a strong background in biomedical research.

      A major issue related to the author's claim of computational advance of miRTarDS: The author does not introduce existing biomedical-specific language models, and does not compare them against miRTarDS's fine-tuned model. The performance of miRTarDS is largely dependent on the semantic embedding of disease terms. The author shows in Figure 5 that MeSH-based fine-tuning leads to a substantial improvement in MeSH-based correlation compared to the publicly available pretrained SBERT model "multi-qa-MiniLM-L6-cos-v1" without sacrificing a large amount of BIOSSES-based correlation. However, the author does not compare the performance of MeSH- and BIOSSES-based correlation with existing language models such as ChatGPT, BioBERT, PubMedBERT, and more. Also, the substantial improvement in MeSH-based correlation is a mere indication that the MeSH-based fine-tuning strategy was reasonable and not that it's superior to the publicly available pretrained SBERT model "multi-qa-MiniLM-L6-cos-v1".

      Another major issue is in the author's claim that disease-information from miRTarDS's language model is "sufficient" for accurate microRNA target prediction. Available microRNA targets with experimental evidence are largely biased for those with disease implications that have been reported in the biomedical literature. It's possible that their language model is biased by existing literature that has also been used to build microRNA target databases. Therefore, it is important that the author provides strong evidence that excludes the possibility of data leakage circularity. Similar concerns are prevalent across the manuscript, and so I highly recommend that the author reassess the evaluation frameworks and account for inflated performance, biased conclusions, and self-confirming results.

      Last but not least, the manuscript requires a deeper and more careful description and computational encoding of microRNA biology. I'd advise the author to include an expert in microRNA biology to improve the quality of this manuscript. For example, the author replaces the mature miRNA notation with the pre-miRNA notation to maintain computational encoding consistency across databases. However, the mature microRNA suffix ("-3p" or "-5p") is critical, as the 3p and 5p mature microRNAs have different seed sequences and thus different mRNA targets. The 3p mature microRNA would most likely not target an mRNA targeted by the 5p mature microRNA.

    3. Reviewer #2 (Public review):

      Summary:

      This study introduces a novel knowledge-driven approach, miRTarDS, which enables microRNA-Target Interaction (MTI) prediction by leveraging the disease association degree between a miRNA and its target gene. The core hypothesis is that this single feature is sufficient to distinguish experimentally validated functional MTIs from computationally predicted MTIs in a binary classification setting. To quantify the disease association, the authors fine-tuned a Sentence-BERT (SBERT) model to generate embeddings of disease descriptions and compute their semantic similarity. Using only this disease association feature, miRTarDS achieved an F1 score of 0.88 on the test set.

      Strengths:

      The primary strength is the innovative use of the disease association degree as an independent feature for MTI classification. In addition, this study successfully adapts and fine-tunes the Sentence-BERT (SBERT) model to quantify the semantic similarity between biomedical texts (disease descriptions). This approach establishes a critical pathway for integrating powerful language models and the vast growth in clinical/disease data into biochemical discovery, like MTI prediction.

      Weaknesses:

      The main weakness lies in its definition of the ground-truth dataset, which serves as a foundation for methodological evaluation. The study defines the Negative Set as computationally predicted MTIs that lack experimental evidence. However, the absence of experimental validation does not equate to non-functionality. Similarly, the miRAW sets are classified by whether the target and miRNA could form a stable duplex structure according to RNA structure prediction. This definition is biologically irrelevant, as duplex stability does not fully encapsulate the complex in vivo binding of miRNAs within the AGO protein complex.

    4. Author response:

      We would like to express our sincere gratitude to the editors and the two reviewers for providing their constructive and valuable comments that will greatly guide us in improving the manuscript. We will revise the manuscript according to their critiques and suggestions. The existing code for this study, along with preliminary code developed in response to the review comments, has been made publicly available at https://github.com/cbaiming/miRTarDS. We now provide detailed responses to each reviewer below.

      Reviewer #1 (Public review):

      The author presents a new method for microRNA target prediction based on (1) a publicly available pretrained Sentence-BERT language model that the author fine-tunes using MeSH information and (2) downstream classification analysis for microRNA target prediction. In particular, the author's approach, named "miRTarDS", attempts to solve the microRNA target prediction problem by utilizing disease information (i.e., semantic similarity scores) from their language model. The author then compares the prediction performance with other sequence- and disease-based methods and attempts to show that miRTarDS is superior or at least comparable to existing methods. The author's general approach to this microRNA target prediction problem seems promising, but fails to demonstrate concrete computational evidence that miRTarDS outperforms other existing methods. The author's claim that disease information-based language models are sufficient is unfounded. The manuscript requires substantial rewriting and reorganization for readers with a strong background in biomedical research.

      We appreciate the reviewer’s careful examination of modeling, benchmarking, and interpretation, and we are particularly encouraged that they found the proposed method promising. We will make corresponding revisions to the manuscript based on the reviewer’s comments.

      A major issue related to the author's claim of computational advance of miRTarDS: The author does not introduce existing biomedical-specific language models, and does not compare them against miRTarDS's fine-tuned model. The performance of miRTarDS is largely dependent on the semantic embedding of disease terms. The author shows in Figure 5 that MeSH-based fine-tuning leads to a substantial improvement in MeSH-based correlation compared to the publicly available pretrained SBERT model "multi-qa-MiniLM-L6-cos-v1" without sacrificing a large amount of BIOSSES-based correlation. However, the author does not compare the performance of MeSH- and BIOSSES-based correlation with existing language models such as ChatGPT, BioBERT, PubMedBERT, and more. Also, the substantial improvement in MeSH-based correlation is a mere indication that the MeSH-based fine-tuning strategy was reasonable and not that it's superior to the publicly available pretrained SBERT model "multi-qa-MiniLM-L6-cos-v1".

      We thank the reviewer for the constructive suggestions regarding the benchmarking of language models. We acknowledge that the performance of miRTarDS largely depends on the semantic embeddings of disease terms. In the revision, we will therefore 1) conduct a literature review to introduce existing biomedical-specific language models, and 2) perform a side-by-side comparison between our fine-tuned model and these existing models to evaluate the model’s capabilities more comprehensively.

      Another major issue is in the author's claim that disease-information from miRTarDS's language model is "sufficient" for accurate microRNA target prediction. Available microRNA targets with experimental evidence are largely biased for those with disease implications that have been reported in the biomedical literature. It's possible that their language model is biased by existing literature that has also been used to build microRNA target databases. Therefore, it is important that the author provides strong evidence that excludes the possibility of data leakage circularity. Similar concerns are prevalent across the manuscript, and so I highly recommend that the author reassess the evaluation frameworks and account for inflated performance, biased conclusions, and self-confirming results.

      We thank the reviewer for the comment. We recognize that existing experimentally validated microRNA targets may be biased toward those reported in the biomedical literature as disease-related. To mitigate this bias, we used the K-Nearest Neighbors (KNN) method to extract predicted microRNA targets whose numbers of miRNA–disease and gene–disease entries closely match those of the experimentally validated microRNA targets. We then applied Positive-Unlabeled (PU) Learning to classify the two groups. PU Learning is designed for scenarios where only a subset of the training data is explicitly labeled as positive while the remaining data are unlabeled, with the unlabeled set containing both potential positives and true negatives; this makes it well suited to the application context of this manuscript [1]. Preliminary results show that with the new data extraction and classification approach, model performance drops to around F1 = 0.73 (the MISIM method also declines, to an F1 of around 0.58; detailed code is available on GitHub). The specific reasons for this require further investigation.
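
      The matching step described above can be illustrated with a minimal sketch. It assumes greedy one-nearest-neighbor matching without replacement and Euclidean distance on the two entry counts; the response does not state the distance metric, the value of k, or whether matching is done with replacement, so all of these are assumptions. The function name `match_nearest` and the toy records are hypothetical.

```python
import math


def match_nearest(validated, predicted):
    """Match each validated pair to its closest unused predicted pair.

    Each record is (pair_id, n_mirna_disease_entries, n_gene_disease_entries).
    Assumption: greedy 1-nearest-neighbour matching without replacement,
    using Euclidean distance on the two counts.
    """
    available = list(predicted)
    matches = {}
    for pair_id, n_mirna, n_gene in validated:
        if not available:
            break  # no predicted pairs left to match
        best = min(
            available,
            key=lambda p: math.hypot(p[1] - n_mirna, p[2] - n_gene),
        )
        matches[pair_id] = best[0]
        available.remove(best)
    return matches


# Toy records, invented for illustration only.
validated = [("val_1", 10, 4), ("val_2", 2, 30)]
predicted = [("pred_a", 3, 28), ("pred_b", 11, 5), ("pred_c", 50, 50)]
matches = match_nearest(validated, predicted)
```

      In this toy case `pred_c`, whose counts are far from both validated pairs, is left unmatched, which is the intended effect of balancing the two groups on annotation density.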

      Last but not least, the manuscript requires a deeper and more careful description and computational encoding of microRNA biology. I'd advise the author to include an expert in microRNA biology to improve the quality of this manuscript. For example, the author replaces the mature miRNA notation with the pre-miRNA notation to maintain computational encoding consistency across databases. However, the mature microRNA suffix ("-3p" or "-5p") is critical, as the 3p and 5p mature microRNAs have different seed sequences and thus different mRNA targets. The 3p mature microRNA would most likely not target an mRNA targeted by the 5p mature microRNA.

      We thank the reviewer for the critique and suggestion. We fully agree with the reviewer that the distinction between the 3p and 5p mature strands is critical for determining mRNA targeting, as they possess distinct seed sequences. In our study, we relied on the miRNA–disease associations provided by the HMDD database, which annotates interactions at the pre-miRNA level: “… the enriched functions of each mature miRNA are aggregated to the corresponding miRNA precursor.” [2] Furthermore, existing literature suggests that the pre-miRNA level can be appropriate and informative for disease association analyses: “Compared with the mature miRNA method, the pre-miRNA method is more useful for studying disease association.” [3] We also find that, in some cases, both strands cooperate to regulate the same or complementary pathways [4]. We acknowledge the reviewer’s point as an important consideration for future revision. We plan to consult or collaborate with biologists to strengthen the biological content of the manuscript.
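
      To make the notation issue concrete, the sketch below shows how collapsing mature names to a precursor-level name merges the 5p and 3p arms into one key. It is a simplified illustration, not the manuscript's encoding: it assumes the common naming pattern in which mature names carry a "-5p"/"-3p" suffix and a capitalized "miR", and it ignores complications such as multiple precursor loci. The helper names are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: mature names end in "-5p" or "-3p".
_ARM_SUFFIX = re.compile(r"-(3p|5p)$")


def to_precursor(mature_name):
    """Drop the arm suffix and switch 'miR' to 'mir' (simplified convention)."""
    stem = _ARM_SUFFIX.sub("", mature_name)
    return stem.replace("-miR-", "-mir-")


def group_by_precursor(mature_names):
    """Group mature miRNA names under their collapsed precursor-level key."""
    groups = defaultdict(list)
    for name in mature_names:
        groups[to_precursor(name)].append(name)
    return dict(groups)


names = ["hsa-miR-21-5p", "hsa-miR-21-3p", "hsa-miR-155-5p"]
groups = group_by_precursor(names)
```

      After collapsing, the two arms of miR-21 share one key, so any target assigned at that level can no longer be attributed to a specific seed sequence. This is the information loss the reviewer describes.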

      Reviewer #2 (Public review):

      This study introduces a novel knowledge-driven approach, miRTarDS, which enables microRNA-Target Interaction (MTI) prediction by leveraging the disease association degree between a miRNA and its target gene. The core hypothesis is that this single feature is sufficient to distinguish experimentally validated functional MTIs from computationally predicted MTIs in a binary classification setting. To quantify the disease association, the authors fine-tuned a Sentence-BERT (SBERT) model to generate embeddings of disease descriptions and compute their semantic similarity. Using only this disease association feature, miRTarDS achieved an F1 score of 0.88 on the test set.

      We thank the reviewers for their positive feedback, especially for their recognition of the novelty of this manuscript.

      Strengths:

      The primary strength is the innovative use of the disease association degree as an independent feature for MTI classification. In addition, this study successfully adapts and fine-tunes the Sentence-BERT (SBERT) model to quantify the semantic similarity between biomedical texts (disease descriptions). This approach establishes a critical pathway for integrating powerful language models and the vast growth in clinical/disease data into biochemical discovery, like MTI prediction.

      We would like to thank the reviewer again for their positive feedback. We appreciate their recognition of the novelty of our work, as well as their acknowledgment that the proposed method paves the way for integrating language models with clinical/disease data into biochemical discovery.

      Weaknesses:

      The main weakness lies in its definition of the ground-truth dataset, which serves as a foundation for methodological evaluation. The study defines the Negative Set as computationally predicted MTIs that lack experimental evidence. However, the absence of experimental validation does not equate to non-functionality. Similarly, the miRAW sets are classified by whether the target and miRNA could form a stable duplex structure according to RNA structure prediction. This definition is biologically irrelevant, as duplex stability does not fully encapsulate the complex in vivo binding of miRNAs within the AGO protein complex.

      We thank the reviewers for their constructive feedback. We recognize that treating predicted MTIs as a negative class is problematic. We have therefore decided to adopt Positive-Unlabeled (PU) Learning in subsequent updates. This classification method can be applied to datasets such as ours, which contain only positive labels and lack confirmed negatives [1]. We used the miRAW dataset to enable a side-by-side comparison of our method with traditional sequence-based prediction approaches. We acknowledge that miRAW may overlook some biological insights, and we plan to improve the construction of test datasets in the future. Some preliminary explorations have already been conducted, and the relevant code is available on GitHub.
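
      As an illustration of how PU Learning can work, the sketch below shows one widely used calibration step (usually attributed to Elkan and Noto), which assumes that labeled positives are a random sample of all positives. The response does not say which PU method is used, so this is an assumed example rather than the authors' approach. The scores are invented and stand in for the output of any classifier trained to separate labeled from unlabeled examples.

```python
def estimate_label_frequency(scores_on_known_positives):
    """Estimate c = P(labeled | positive) as the mean classifier score
    on held-out examples that are known to be positive."""
    if not scores_on_known_positives:
        raise ValueError("need at least one held-out positive score")
    return sum(scores_on_known_positives) / len(scores_on_known_positives)


def adjust_scores(scores, c):
    """Convert P(labeled | x) into P(positive | x) = P(labeled | x) / c,
    clipped to 1.0 because the ratio can exceed one for noisy estimates."""
    if not 0.0 < c <= 1.0:
        raise ValueError("label frequency must lie in (0, 1]")
    return [min(1.0, s / c) for s in scores]


# Hypothetical classifier outputs, for illustration only.
held_out_positive_scores = [0.5, 0.25, 0.75]
unlabeled_scores = [0.125, 0.25, 0.75]

c = estimate_label_frequency(held_out_positive_scores)
posterior = adjust_scores(unlabeled_scores, c)
```

      The point of the rescaling is that an unlabeled example is not treated as a confirmed negative: its score is inflated by 1/c to reflect that many true positives were never labeled.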

      Furthermore, we will make the following revisions: 1) clearly specify the version of miRBase and incorporate more miRNA-related databases; 2) conduct a further literature review on miRNA biological mechanisms to strengthen the biological content of the manuscript; 3) perform a more comprehensive evaluation of the model’s performance; and 4) attempt to identify representative MTIs that have been overlooked by existing prediction tools but can be predicted by our proposed method.

      References

      (1) Li, F., Dong, S., Leier, A., Han, M., Guo, X., Xu, J., ... & Song, J. (2022). Positive-unlabeled learning in bioinformatics and computational biology: a brief review. Briefings in Bioinformatics, 23(1), bbab461.

      (2) Huang, Z., Shi, J., Gao, Y., Cui, C., Zhang, S., Li, J., ... & Cui, Q. (2019). HMDD v3.0: a database for experimentally supported human microRNA–disease associations. Nucleic Acids Research, 47(D1), D1013-D1017.

      (3) Wang, H., & Ho, C. (2023). The human pre-miRNA distance distribution for exploring disease association. International Journal of Molecular Sciences, 24(2), 1009.

      (4) Mitra, R., Adams, C. M., Jiang, W., Greenawalt, E., & Eischen, C. M. (2020). Pan-cancer analysis reveals cooperativity of both strands of microRNA that regulate tumorigenesis and patient survival. Nature Communications, 11(1), 968.

    1. eLife Assessment

      This is an important paper that reports in vivo physiological abnormalities in the hippocampus of a rat model of traumatic brain injury (TBI). In this study, the authors focused on changes in theta-gamma phase coupling and action potential entrainment to theta, phenomena hypothesized to be critical for cognition. The authors provide convincing evidence of deficits in both features post-TBI and contribute new understanding of how disruptions in oscillatory coordination and spike timing may relate to cognitive impairment.

    2. Reviewer #1 (Public review):

      Summary:

      This study examines how traumatic brain injury (TBI) alters hippocampal network dynamics and single-unit activity in awake, behaving rats. Using laminar recordings, the authors report reductions in theta power, theta-gamma phase-amplitude coupling, and spike-field entrainment, alongside impairments in spatial memory performance.

      Strengths of the study include the use of high-density laminar electrodes to localize activity across hippocampal layers and the integration of electrophysiological and behavioral measures. Analyses that consider behavioral state and account for broadband power changes improve confidence in the interpretation of oscillatory effects. Additional controls suggest that the observed differences are unlikely to be explained by gross motor or motivational deficits. The reported relationships between theta amplitude, phase-amplitude coupling, and spike entrainment provide useful insight into how network coordination is disrupted following injury.

      There are a few minor weaknesses. The analyses of single-unit activity across environments are relatively limited, and more comprehensive approaches to characterizing spatial coding would strengthen conclusions about how TBI impacts hippocampal representations. The behavioral assessment relies primarily on a single task, which constrains the interpretation of the cognitive deficits. In addition, the relatively small number of animals is a limitation, although this is partially mitigated by the number of recorded units and the consistency of effects across measures.

      Overall, this work provides a careful characterization of hippocampal circuit dysfunction following TBI and contributes to understanding how disruptions in oscillatory coordination and spike timing may relate to cognitive impairment.

      Comments on revisions:

      The authors have adequately addressed all of my concerns.

    3. Reviewer #3 (Public review):

      Summary:

      In this study, the authors examined the network-level effects on CA1 of traumatic brain injury induced by the LFPI procedure. The major findings seem to be that TBI reduces theta and gamma power in CA1, reduces phase-amplitude coupling between the theta and gamma bands, and disrupts the gamma entrainment of interneurons. I think the authors have made some important discoveries that could help advance the understanding of TBI effects at the physiological level. However, further investigation into the relationship between behavioral and brain states and the observed effects would help clarify the interpretations for readers.

      Strengths:

      The authors were able to combine behavioral verification of the TBI model with laminar electrophysiological recordings of the CA1 region to bring forward network-level anomalies, both in the temporal coordination of network oscillations and in the firing of interneurons. Indeed, it seems that the findings may help future studies to better understand TBI functionally and/or to refine therapies for it.

      Weaknesses:

      The discoveries made in the paper, and their broad interpretations, would benefit from further characterization and comparison of brain and behavioral states during both immobility and movement. Brain injury affecting several parts of the brain can alter brain-wide LFP and/or behavior. The altered behavior and/or LFP patterns might then lead to reduced spiking and unreliable LFP oscillations in the hippocampus. Hence, claims made in the abstract such as "These results reveal deficits in information encoding and retrieval schemes essential to cognition that likely underlie TBI-associated learning and memory impairments, and elucidate potential targets for future neuromodulation therapies" are not sufficiently supported, because the study does not test whether the disruptions were related to information encoding and retrieval or were due to sensory-motor and/or behavioral deficits that could also occur after TBI.

      Movement velocity is already known to be correlated with the entrainment of spikes to the theta rhythm and, in some cases, to gamma oscillations. It is therefore important to disentangle differences in behavioral variables from the observed effects. As an example, the authors' claims of disrupted temporal coding (as shown in the graphical abstract) might suffer from these confounds. On the one hand, the reduced entrainment might be due to decreased LFP power (induced by injury in different brain areas) resulting in altered behavior, and/or to unreliable oscillations in LFP bands such as theta and gamma, rather than to a memory encoding- and retrieval-related disruption of spike synchrony to the rhythms. On the other hand, it may simply be due to reduced neuronal excitability, particularly in the behavioral and brain state in which the effects were observed, rather than to a disrupted temporal code. Hence, further investigation dissociating these factors could help readers understand mechanistically the interesting results observed by the authors.

      Comments on revisions:

      The authors have substantially improved the manuscript in response to the previous reviews. In particular, the revisions addressing possible TBI-induced behavioral deficits, which were surprisingly absent (or at most minimal) in the injured rats, have strengthened the study and improved the support for the main conclusions. Overall, the manuscript is now clearer and more rigorous. The authors have also addressed all the minor points raised. As a result, the study is now solid, with the major findings broadly supported by the data.

    4. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This is an important paper that reports in vivo physiological abnormalities in the hippocampus of a rat model of traumatic brain injury (TBI). In this study, authors focused on changes in theta-gamma phase coupling and action potential entrainment to theta, phenomena hypothesized to be critical for cognition. While the authors provide solid evidence of deficits in both features post-TBI, the study would have been stronger with a more hypothesis-driven approach and consideration of alterations of the animal's behavioral state or sensorimotor deficits beyond memory processes.

      We would like to thank the reviewers for their comments on our manuscript. By incorporating their feedback, we were able to make our hypotheses clearer, expand our analyses to compare physiological processes across similar behavioral states, and address extrahippocampal input and potential sensorimotor confounds in our data.

      Specifically, we have added new data in Figure 5 showing how theta amplitude correlates with theta-gamma PAC and entrainment strength. We have also added Supplementary Figure 1 demonstrating that there are no differences in exploration or movement velocity in injured animals compared to shams. Supplementary Figures 2, 3, and 4 were added to compare oscillatory power while animals were still, while they were moving at a higher velocity, and following a broadband power shift correction, respectively. We also added Supplementary Figure 7 demonstrating that there were no differences in firing rates between sham and injured animals while they were still or moving, and Supplementary Figure 8 showing no changes in pyramidal cell bursting. Finally, we added Supplementary Figure 10 showing that there was no difference in velocity or distance traveled during testing in the MWM between sham and injured animals and that learning curves were similar across groups before sham/injury surgery. We believe that the addition of these data significantly improves our manuscript by more strongly controlling for the animal’s behavioral state in our analyses, and provides strong evidence that significant sensory/motor deficits were not present in injured animals at this injury level and time point post injury. Below we address specific points raised by the reviewers.
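
      For readers less familiar with entrainment measures, the sketch below computes the mean resultant length of spike phases, one common way to quantify how tightly spikes lock to an oscillation. The response does not state which metric underlies "entrainment strength", so this choice is an assumption, and the phase values are invented.

```python
import cmath
import math


def mean_resultant_length(phases):
    """Length of the average unit vector of the given phases (radians).

    Values near 0 indicate phases spread around the cycle; values near 1
    indicate spikes concentrated at one phase.
    """
    if not phases:
        raise ValueError("need at least one phase")
    mean_vector = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(mean_vector)


# Invented spike phases relative to a reference rhythm.
locked = [0.1, 0.0, -0.1, 0.05]                         # clustered near phase 0
spread = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]   # evenly spaced

locked_strength = mean_resultant_length(locked)
spread_strength = mean_resultant_length(spread)
```

      Because this statistic is biased upward for small spike counts, comparisons between groups are usually made with matched spike numbers or a bias-corrected variant.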

      Reviewer #1 (Public review):

      Summary:

      This study investigated how traumatic brain injury affects oscillatory and single-unit hippocampal activity in awake-behaving rats.

      Strengths:

      The use of high-density laminar electrodes enabled precise localization of recording sites. To ensure an unbiased, rigorous approach, single-unit analysis was performed by a reviewer who was blind to experimental conditions. A proof of concept study was undertaken to characterize the pathology that resulted from the specific TBI model used in the main study. There was an effort to link abnormalities in hippocampal activity to memory disruption by running a cohort of rats on the Morris Water Maze task.

      Weaknesses:

      The paper is written as if the experiment was exploratory and not hypothesis-driven despite the fact that there is a wealth of experimental evidence about this TBI model that could have informed very specific predictions to test a hypothesis that is only hinted at in the discussion. The number of rats used for the spatial working memory experiment is not reported. Some of the statistics are not completely reported. It is also unclear what the rationale was for recording single units in a novel and familiar environment. Furthermore, this analysis comparing single-unit activity between familiar and novel environments is quite rudimentary. There are much more rigorous analyses to answer the question of how hippocampal single-unit firing patterns differ across changes in environments. There are details lacking about the number of units recorded per session and per rat, all of which are usually reported in studies that record single units. Spatial working memory assessment is delegated to a single panel of a supplementary figure. More importantly, there is no effort to dissociate between spatial working memory deficits and other motor, motivational, or sensory deficits that could have been driving the lower "memory score" in the experimental group.

      In order to address these important concerns, we have made the following changes:

      (1) We have updated the results section to include more rationale for the recordings and analyses used to clarify our hypotheses. In addition, we hope that our extensive characterization will lay the groundwork to inform future studies investigating circuit-specific disruptions following TBI and neuromodulatory therapies.

      (2) The number of rats used for the spatial working memory experiment is reported in the text and figure legend.

      (3) We have added supplemental Table 2 to include the requested statistical information (t-statistic, degrees of freedom, and 1 vs 2-tailed analyses).

      (4) Unfortunately, we did not have adequate occupancy to robustly extract and compare place cell properties across groups and environments which obscured the rationale of our study design and limited us to more rudimentary analyses. While animals did actively explore the two environments, the relatively short recording time limited the spatial sampling of the two-dimensional environment. We were able to extract putative place cells and found some evidence that place cells in TBI rats had lower spatial information content than in shams (as has previously been described). However, we did not feel that place cell analyses were rigorous enough to include in this manuscript due to the limited spatial sampling. Future studies in the lab will assess how TBI affects place cell information content, stability, and phase precession with better occupancy.

      (5) We have added Supplemental Table 1 that includes the total number of units recorded for each animal.

      (6) The spatial working memory deficit we report in the MWM is not a novel finding in this model of TBI. However, we wanted to ensure that LFPI in our hands at this injury level reproduced this known deficit. Importantly, the swim speed and distance traveled during testing did not differ between groups, suggesting that differences were not due to motor deficits. Additionally, the learning curves before sham/LFPI surgery were the same across groups. This data has been added to the manuscript in Supplementary Figure 10. While we did not test animals in a version of the task where the platform was visibly marked, previous studies have demonstrated that sham and injured rats perform comparably in a version of the MWM where the platform is visible or when a constant start location is used. These citations have been added to the manuscript.

      Reviewer #1 (Recommendations for the authors):

      For a more rigorous way of analyzing changes in hippocampal firing patterns across environments, see Wills et al 2005 for example.

      Addressed in point 4 above

      Spatial working memory tasks should always be compared with a control task to rule out confounding performance variables. Examples would be to use a variant of the MWM task that does not require the hippocampus such as using a visible escape platform.

      Addressed in point 6 above

      Statistics are typically reported including a t-statistic and degrees of freedom, not just the p-value. In addition, the authors should indicate whether the t-test is one or two-tailed.

      Addressed in point 3 above
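
      For reference, the two quantities requested here can be computed as in the sketch below for a two-sample Student's t-test with pooled variance. The manuscript's actual test variant is not stated in this exchange, so the pooled-variance form is an assumption, and the sample values are invented. The p-value is omitted because it requires the t distribution, which the Python standard library does not provide.

```python
import math
import statistics


def pooled_t_test(a, b):
    """Return (t, df) for Student's two-sample t-test with pooled variance."""
    n_a, n_b = len(a), len(b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    df = n_a + n_b - 2
    pooled_var = (
        (n_a - 1) * statistics.variance(a) + (n_b - 1) * statistics.variance(b)
    ) / df
    if pooled_var == 0.0:
        raise ValueError("t is undefined when both groups have zero variance")
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(a) - statistics.mean(b)) / standard_error
    return t, df


# Invented measurements for two small groups.
sham = [1.0, 2.0, 3.0]
injured = [2.0, 3.0, 4.0]
t_stat, dof = pooled_t_test(sham, injured)
```

      Reporting `t_stat` and `dof` alongside the p-value, and stating whether the test was one- or two-tailed, gives readers what they need to check the result.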

      Reviewer #2 (Public review):

      Summary:

      The authors investigate changes in theta-gamma phase amplitude coupling, and action potential entrainment to theta following traumatic brain injury (TBI). Both phenomena are widely hypothesized to be important for cognition, and the authors report deficits in both after TBI. The manuscript is well-written, the figures are well-constructed, and the authors' use of high-level analysis methods for TBI EEG data collected from awake, behaving animals is welcome.

      Major Comments:

      The animal n's are small (4 sham and 5 injured). In Figure 3, for instance, one wonders if panels D and E might have shown significant differences if more animals had been recorded.

      There are conflicting reports regarding the effect of LFPI on single cell firing rates. This is likely due to differential task demands and variations in LFPI severity across studies. We agree that the firing rates do appear to be trending; however, overall firing rate changes can be difficult to interpret. Because firing rates are influenced by behavior and brain state, we further separated firing rates into epochs when animals were moving or still and found similar trends that did not reach significance (data added in Supplementary Figure 7). We also assessed bursting in pyramidal cells to investigate whether potential changes in bursting influenced overall firing rates, and we found no differences between sham and injured animals across conditions (data added in Supplementary Figure 8). While the n’s are small when considered by animal, the number of units is actually fairly large, so if there were robust effects (as there were for the entrainment analyses), we would expect to see significant differences.

      The text focuses on deficits in the theta and gamma bands, but the reduction in power appears to be broadband (see Figure 1F, especially Pyramidal cell layer panel). Therefore, the overall decrease in broadband (in the injured population) must be normalized between sham and injured animals before a selective comparison between sham and injured animals can be conducted. That is the only way that selective narrow bands i.e., theta and low gamma can be compared between the two cohorts. A brief discussion of the significance of a broadband decrease would be appreciated.

      This is an excellent point that has now been addressed with the addition of Supplementary Figure 4. We used a well-established method (Donoghue et al., 2020) to flatten power spectra in order to compare specific frequency bands in the context of a broadband shift. After applying this correction, we show that theta power is still reduced in injured rats compared to shams. While there is no difference in gamma power between groups in the corrected power spectra, this result should be interpreted with caution, especially since there is not a large distinct peak in the gamma frequency range in the power spectrum of either sham or injured animals. However, if this is interpreted to mean that gamma power is not different between sham and injured animals, it makes the PAC data even more compelling. While there is clearly a broadband shift, it is limited to ~4-90Hz, a range that contains physiologically relevant frequencies associated with synaptic currents. Importantly, the power spectra of sham and injured animals converge at low (<4Hz) and high (>100Hz) frequencies. This suggests that slow oscillations, which could include delta and respiration-associated oscillations, are not affected by TBI (though sleep recordings would be needed to properly address this). High-frequency activity can include ripples and HFOs, which need to be separately extracted when comparing between groups due to their transient nature. However, overall spiking activity, including the depolarizing spike and the afterhyperpolarization, contributes significantly to power in the high-frequency range. Because this general high-frequency power is not different between groups, it suggests that the limited range of the broadband power reduction still contains important physiological signals. This broadband shift may result from a global reduction in or desynchronization of synaptic input to CA1. The specific mechanisms behind this broadband shift and the consequences it has on coding information in the hippocampus are fascinating questions that we hope will be specifically investigated in future studies. This point is now addressed in the Discussion.
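
      For readers unfamiliar with spectral flattening, the idea can be sketched as follows. This is a deliberately simplified stand-in, not the parameterization method of Donoghue et al. (2020) used for Supplementary Figure 4: it fits a single straight line to the spectrum in log-log space and subtracts it, so that a band-limited peak can be compared between groups despite a broadband offset. The function names `flatten_spectrum` and `synthetic_spectrum`, and all numeric values, are illustrative assumptions rather than data from the study.

```python
import math


def flatten_spectrum(freqs, power):
    """Remove a 1/f-like trend by subtracting a least-squares line
    fitted to log10(power) versus log10(frequency).

    Simplified illustration only; the published method also models
    the oscillatory peaks explicitly when fitting the background.
    """
    x = [math.log10(f) for f in freqs]
    y = [math.log10(p) for p in power]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual log power after removing the fitted background
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def synthetic_spectrum(freqs, offset, exponent, peak_hz, peak_height):
    """Hypothetical spectrum: power-law background plus one Gaussian
    bump (in log power) centred on peak_hz."""
    spectrum = []
    for f in freqs:
        log_p = offset - exponent * math.log10(f)
        log_p += peak_height * math.exp(-((f - peak_hz) ** 2) / (2 * 1.5 ** 2))
        spectrum.append(10 ** log_p)
    return spectrum


freqs = list(range(1, 101))
# Made-up values: the "injured" spectrum has a lower broadband offset
# and a smaller theta-band bump than the "sham" spectrum.
sham = flatten_spectrum(freqs, synthetic_spectrum(freqs, 2.0, 1.5, 8, 0.6))
injured = flatten_spectrum(freqs, synthetic_spectrum(freqs, 1.7, 1.5, 8, 0.3))

theta_idx = freqs.index(8)
print(sham[theta_idx] > injured[theta_idx])
```

      In this toy case the broadband offset is removed entirely by the fit, so the remaining difference at 8 Hz reflects only the difference in peak height. A plain line fit is biased by large peaks, which is one reason dedicated parameterization tools are preferred for real data.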

      Reviewer #2 (Recommendations for the authors):

      Minor Comments:

      Please define your reference waveform for theta - is it theta recorded on the channel containing the cell? Average theta for all electrodes in SP? SP + SO? Theta for the nominal "St. pyr." channel? Please define.

      For all entrainment analyses, entrainment was measured referenced to the theta oscillation recorded from st. pyr. on the specific shank where the unit was detected. We added clarification in the results and methods sections regarding this point.

      Similarly, even though the peak of the theta wave appears from the figures to be taken as 0 degrees, please explicitly state this in the text.

      This has been added to the results and methods.

      Did the authors check for any difference between interneurons in SP and interneurons in SO?

      This is an excellent suggestion that we had hoped to investigate as it could inform whether specific interneuron populations were affected. However, we did not record enough units in st. ori to make this comparison.

      On page 8, Figures 3E and 3F are incorrectly labeled 4E and 4F.

      This has been fixed.

      Figure 1, panel C: please add a numerical scale to the colored scale bar.

      This has been added.

      Figure 1, panel F: how was the significance between the frequency bands calculated?

      Statistics were done using a t-test at each frequency point with significance set at α=0.01 for multiple comparisons. This has been clarified in the figure legend and methods.

      Figure 3, panel A legend: Please add "Spike at 0 ms omitted for clarity.”

      This has been added.

      Figure 4, panel A, right side: please provide the MVL for this cell, so that readers have a benchmark for evaluating the MVL as a parameter. A sample poorly entrained cell, with MVL, would also be informative.

      We added the MVL for this cell. We were unable to add a poorly entrained cell without making the figure more confusing.
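
      As a benchmark for reading MVL values, the quantity can be sketched as the length of the mean resultant vector of spike phases, which is the usual circular-statistics definition; whether any additional correction was applied in the manuscript is not stated here. Phases are in radians, with 0 taken as the theta peak as described above. The function name and both phase lists are made up for illustration.

```python
import cmath
import math


def mean_vector_length(phases_rad):
    """Return (MVL, mean phase) for a list of spike phases in radians.

    Each spike is treated as a unit vector on the circle; the MVL is the
    length of the average vector (0 = no phase preference, 1 = every
    spike at the same phase).
    """
    if not phases_rad:
        raise ValueError("at least one spike phase is required")
    resultant = sum(cmath.exp(1j * p) for p in phases_rad) / len(phases_rad)
    return abs(resultant), cmath.phase(resultant)


# Hypothetical well-entrained cell: spikes clustered around the theta peak
tight_phases = [0.1 * k for k in range(-5, 6)]
# Hypothetical poorly entrained cell: spikes spread evenly over the cycle
spread_phases = [2 * math.pi * k / 12 for k in range(12)]

tight_mvl, tight_phase = mean_vector_length(tight_phases)
spread_mvl, _ = mean_vector_length(spread_phases)
print(round(tight_mvl, 2), round(spread_mvl, 2))
```

      The clustered example gives an MVL close to 1 and the evenly spread example gives an MVL close to 0, which brackets the values a reader would see for real units.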

      Raw data must be provided for the Morris Water Maze experiments described in Supplementary Figure 3.

      We added data showing no difference in the swim velocity or distance traveled between the sham and injured groups during memory testing as well as data showing that the two groups had similar learning curves during training before sham/injury surgery. See Supplementary Figure 10.

      Antibody 22C11 for APP has been shown to be non-specific when used for immunocytochemistry (it may be fine for Westerns). In addition, using a biotinylated secondary with an ABC kit for visualization risks contamination by post-injury changes in biotin. Reviewed in Xiong et al., 2023, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10580020/.

      As is standard practice in neuropathology, negative controls were run for all of these experiments (identical preparations minus the primary antibody). No non-specific staining was present that could be misinterpreted as APP-positive axonal profiles in either sham or injured tissue. While beyond the scope of this response, there are many reasons the authors of the cited paper may have had non-specific staining, including a concentration 450X that of the one utilized here and the absence of an antigen-retrieval technique in their protocol.

      Tummala et al. used in vivo calcium-imaging after TBI and also investigated single-cell activity in familiar and novel environments, and when moving or still. The authors could consider discussing their work.

      We have added a citation for this paper.

      Reviewer #3 (Public review):

      Summary:

      In this study, the authors examined the network-level effects on CA1 of traumatic brain injury induced by the LFPI procedure. The major findings seem to be that TBI reduces theta and gamma power in CA1, reduces phase-amplitude coupling between the theta and gamma bands, and disrupts the gamma entrainment of interneurons. I think the authors have made some important discoveries that could help advance the understanding of TBI effects at the physiological level; however, further investigation of how behavioral and brain states relate to the observed effects would help clarify the interpretations for readers.

      Strengths:

      The authors in this study were able to combine behavioral verification of the TBI model with the laminar electrophysiological recordings of the CA1 region to bring forward network-level anomalies such as the temporal coordination of network-level oscillations as well as in the firing of the interneurons. Indeed, it seems that the findings may serve future studies to functionally better understand and/or refine the therapies for the TBI.

      Weaknesses:

      The discoveries made in the paper and their broad interpretations would benefit from further characterization and comparison of brain and behavioral states during both immobility and movement. The impact of brain injury in several parts of the brain can alter brain-wide LFP and/or behavior. The altered behavior and/or LFP patterns might then lead to reduced spiking and unreliable LFP oscillations in the hippocampus. Hence, claims made in the abstract such as "These results reveal deficits in information encoding and retrieval schemes essential to cognition that likely underlie TBI-associated learning and memory impairments, and elucidate potential targets for future neuromodulation therapies" do not have enough evidence to test whether the disruptions were related to information encoding and retrieval or due to sensorimotor and/or behavioral deficits that could also occur with TBI.

      Movement velocity is already known to be correlated with the entrainment of spikes to the theta rhythm and, in some cases, to gamma oscillations. So, it is important to disentangle the differences in behavioral variables and the observed effects. As an example, the authors' claims of disrupted temporal coding (as shown in the graphical abstract) might have suffered from these confounds. The observed results of reduced entrainment might, on one hand, be due to the decreased LFP power (induced by injury in different brain areas) resulting in altered behavior and/or the unreliable oscillations of the LFP bands such as theta and gamma, rather than memory encoding and retrieval related disruption of spike synchrony to the rhythms, while on the other hand, they may simply be due to reduced excitability in the neurons, particularly in the behavioral and brain state in which the effects were observed, rather than a disrupted temporal code. Hence, further investigations into dissociating these factors could help readers mechanistically understand the interesting results observed by the authors.

      We appreciate the Reviewer’s insights into disentangling the complex interactions between power, entrainment, and excitability, and have attempted to dissociate these further in our analyses. Regarding the broad effects of TBI, we agree that TBI affects many brain regions outside of the hippocampus as well as white matter pathways containing axons from areas where pathology is not visible, which likely results in widespread changes to LFPs across regions and altered behavior. Here we report disrupted network activity in the hippocampus which is likely a consequence of numerous pathologies across multiple brain regions. In the discussion, we speculate that disrupted power and coupling comes from desynchronization of inputs (especially those from the mEC and MS) as well as changes to local circuits within the hippocampus which combine to disrupt temporal coding. While the disrupted processes we report in the hippocampus are implicated in computational processes thought to support learning and memory, we acknowledge that results from this study do not causally reveal a specific mechanism that is directly responsible for cognitive impairments. We have changed the language of the quoted sentence from the abstract to make our claim less causal as we agree that the direct effects of these results on cognition are difficult to quantify due to the fact that animals were not performing a spatial navigation task with measurable outcomes during recordings. We have also removed the graphical abstract as we believe it is an oversimplification of the results given new analyses.

      Regarding the possible contribution of sensory and motor deficits or differences in behavioral states to the observed changes, we agree that it is essential to consider potential sensorimotor deficits as well as the animal’s behavioral state when comparing oscillations and single unit activity in the hippocampus, especially since these phenomena have been extensively linked to movement velocity and exploration. To address this, we have added Supplementary Figure 1 showing that there are no differences in movement velocity or exploration time between sham and injured animals. Because animals were simply foraging during electrophysiological experiments we do not expect there to be any major additional behavioral differences that would influence oscillations or spiking once locomotion is controlled for, though differences in attention or arousal cannot be ruled out. Additionally, analyses throughout the manuscript are performed independently during periods when animals were moving or still. Data in Figures 1 and 2 also only include data from the familiar environment to rule out any effects of novelty on hippocampal oscillations. Supplementary Figures 2 and 3 were added to demonstrate that TBI-associated reductions in power were consistent when animals were still and when a higher threshold for movement (>20 cm/sec) was used. Finally, Supplementary Figure 10 was added showing no differences in swim velocity or distance traveled in the MWM between sham and injured animals, further suggesting that there are no significant sensorimotor deficits at this injury level and timepoint. Additionally, previous studies have demonstrated that sham and injured rats perform comparably in a version of the MWM where the platform is visible or when a constant start location is used, which provides further support that sensorimotor deficits are not responsible for memory deficits in this task (see above).

      Regarding the contribution of neuronal excitability to the reported changes, we agree that changes in the excitability of neurons could have a strong effect on entrainment. Importantly, we show that the disrupted oscillations recorded in the injured hippocampus do not coincide with significant changes in neuronal firing rates between sham and injured animals. We have added Supplementary Figure 7 demonstrating this holds true both when animals are still and when they are moving. Additionally, we have added Supplementary Figure 8 showing no differences in pyramidal cell bursting between sham and injured animals. While this suggests that there are not major changes in excitability, homeostatic plasticity mechanisms may impact firing rates and bursting, and the extent of these effects and their role on entrainment are unclear. This point was added to the Discussion.

      To address the effects of LFP power on entrainment strength, Figure 5 has been updated to show theta and gamma entrainment strength as well as theta-gamma PAC as a function of theta amplitude. We found that, during periods of comparable theta power, interneurons from sham and injured animals are similarly entrained to theta, but pyramidal cells from injured animals become significantly more entrained to theta than in shams. We address the potential implications of these results in the Discussion.
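
      One way to picture an amplitude-matched comparison of this kind is to group spikes by the theta amplitude at which they occurred and compute entrainment within each group. The sketch below is a minimal illustration under stated assumptions: the bin edges, the function name `mvl_by_amplitude`, and the toy spike data are hypothetical and do not reproduce the Figure 5 analysis.

```python
import cmath
import math


def mean_vector_length(phases_rad):
    """Length of the mean resultant vector of spike phases (radians)."""
    resultant = sum(cmath.exp(1j * p) for p in phases_rad) / len(phases_rad)
    return abs(resultant)


def mvl_by_amplitude(phases_rad, amplitudes, edges):
    """Compute MVL separately for spikes falling in each amplitude bin.

    edges is an ascending list of bin edges; bin k covers
    edges[k] <= amplitude < edges[k + 1]. Empty bins yield None.
    """
    bins = [[] for _ in range(len(edges) - 1)]
    for phase, amp in zip(phases_rad, amplitudes):
        for k in range(len(edges) - 1):
            if edges[k] <= amp < edges[k + 1]:
                bins[k].append(phase)
                break
    return [mean_vector_length(b) if b else None for b in bins]


# Toy data: spikes during low theta amplitude are spread over the cycle,
# spikes during high theta amplitude all occur at the same phase.
low_phases = [2 * math.pi * k / 8 for k in range(8)]
high_phases = [0.3] * 5
phases = low_phases + high_phases
amps = [0.2] * len(low_phases) + [0.8] * len(high_phases)

per_bin = mvl_by_amplitude(phases, amps, edges=[0.0, 0.5, 1.0, 1.5])
print(per_bin)
```

      Comparing groups within the same amplitude bin, rather than across all spikes, is what separates a change in entrainment from a change in how often high-amplitude theta occurs.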

      Reviewer #3 (Recommendations for the authors):

      The authors have stated on page 7 and Figure 2E, "Taken together, injured rats show a decrease in the strength of theta-gamma PAC that is specific to st. pyr, and a shift in peak gamma amplitude to a later phase of theta in both st. pyr and st. rad". Is the shift in the peak position greater than expected by chance?

      We are unaware of a rigorous method that would allow us to compare this shift statistically. We have reported the observed shift and avoided calling the shift significant for that reason.

      The authors state on page 9 "cells (sham familiar=1.63±0.23 Hz, n=51, injured familiar=2.11±0.20 Hz, n=141, p=0.446; sham novel=1.84±0.18 Hz, n=55, injured novel=2.23±0.21 Hz, n=134, p=0.170; mean±SEM; ks-test; Fig 4E) between sham and injured groups, but a higher percentage of pyramidal cells were active (firing rate >0.1Hz) in both the familiar and novel environment in injured rats compared to shams (sham=74%, injured=87%, p=0.025, Fisher's exact test; Fig 4F)." Do the authors mean Figures 3E and 3F respectively in place of Figures 4E and 4F?

      This has been fixed.

      Regarding the finding of similar firing rates and differences in the overlap of the neurons that were active between injured and control animals, it is imperative to study the differences in behaviors of the animals. First of all, it seems appropriate to quantify and compare the immobile and mobile periods as well as the movement velocity of the animals in both groups. Then, it would be interesting to see if any behavioral variables correlate with the firing characteristics of the cells in both the sham and the injured animals. Since hippocampal cells have been known to have different levels of recruitment and firing rates according to different behavioral states such as movement velocity, some of the similarities or differences in neural findings might as well be attributed to the differences in behaviors between the groups. However, some differences may be observed in the injured rats despite similar behavior and LFP power. In other words, studying the effects of injury during similar behavioral (e.g. firing rate as a function of movement velocity) and brain states (e.g. categorical effects of awake theta state, type two theta, and ripple states on firing rates and the entrainment) might help dissociate some effects that might only be due to differences in behavior caused by the injury throughout the brain and might as well have less to do with specific injury-induced, local circuit-level deficits in the hippocampus. The results in Figures 4, 5, and 6 reveal such interesting differences and hence, it becomes even more important to quantify and correlate behavioral states (movement velocity and theta/ripple) to the neuronal characteristics (LFP power, PAC, firing rates, and entrainment) presented in Figure 3.

      These are excellent points, and we have addressed them in the following ways:

      We added Supplementary Figure 1 demonstrating that there were no differences in movement velocity between sham and injured animals during electrophysiological recordings.

      Power and PAC analyses were done exclusively when the animal was moving to compare across similar behavioral states. Additionally, these analyses were constrained to recordings from the familiar environment to rule out any effects of novelty. Because animals were simply foraging during recordings we do not expect other behavioral factors besides movement velocity to play a major role in these processes. We have also added Supplementary Figures 2 and 3 which demonstrate that TBI-associated differences in oscillatory power follow similar trends when animals are still (Sup. Fig 2) or when a higher movement threshold (>20cm/sec) is used (Sup Fig 3). We also added Supplementary Figures 7 and 8 showing that there were no significant differences in firing rates or bursting while animals were still or while they were moving.

      The Discussion was expanded to discuss how TBI may disrupt circuits outside the hippocampus which may contribute to our findings. Additionally, we acknowledge the limitation that these recordings were not obtained while animals were doing a quantitatively measurable spatial navigation task which limits our ability to assess whether changes are truly behaviorally relevant.

      We have also updated Figure 5 to show entrainment across different levels of theta power.

      Elaborating on the abovementioned point, Figures 4B and 4E depict a finding that mean entrainment is reduced in injured animals during immobility. The following factors may contribute to the results:

      (1) Reduction in theta power during immobility (reduced attention and/or LFP profile due to brain-wide injury), which makes theta cycles unreliable, which can contribute to the results.

      (2) Changes in neural firing properties during immobility, such as reduced burst rates or firing rates during immobility.

      (3) As the authors claimed in the graphical abstract, there might be an actual disruption of temporal code associated with the memory encoding. It would be awesome if the temporal disruption could be investigated during the comparable theta power and behavioral states. This analysis would test whether there is an unconfounded disruption in the temporal code in the hippocampus due to the injury. In any case, it would be ideal to isolate the epochs during sleep in which animals were in theta state and exclude ripple states to make a definitive assessment of the aforementioned factors. These further investigations would also help the interpretations made by authors in the discussion section such as "This can disrupt type II theta which occurs when animals are not actively moving and exploring the environment. We found that single unit entrainment to theta was substantially decreased in injured rats when they were not moving, a phenomenon not seen in shams, which suggests a disruption in type II theta. This provides further evidence that cholinergic signaling may be dysfunctional following TBI."

      (1) While theta power is reduced in injured animals, it can still be reliably detected even at rest. We added Supplementary Figure 2 showing power spectra while animals were not moving, and a distinct peak can be seen in the theta frequency range. Additionally, clear peaks in entrainment can be seen in the theta frequency band in Fig 4B while animals were still. This suggests that theta can still be reliably detected in injured animals even when they are not moving. However, we agree that reduced attention or arousal could contribute to these changes, and this point has been added to the Discussion.

      (2) We added Supplementary Figures 7 and 8 showing no differences in firing rates or bursting parameters between groups during periods of immobility.

      (3) We updated Figure 5 which now shows entrainment strength as a function of theta amplitude. We found that the theta entrainment strength of both pyramidal cells and interneurons increased with increasing theta amplitudes. We address potential implications of these changes in the Discussion.

      On page 10 the authors state, "theta entrainment strength drastically increased when rats began moving in injured but not sham animals." It is unclear if the effect was confined to the periods when rats started movement. Also, it would be of interest to investigate whether movement epochs and velocity were affected in the periods when the effects were observed.

      This was not confined to the exact points when the rats started moving. We removed the word “began” for clarity. See point regarding velocity above.

      On page 12 the authors state, "On test day, injured rats had a lower memory score than shams (sham=114.8 ± 21.8, n=9; injured=51.5±6.8, n=14; p=0.020; mean ± SEM; Welch's t-test) indicating poor spatial memory (Sup Fig 3A)." The result is the validation of the TBI injury on a hippocampal-dependent Morris water maze task. However, it would be nice to see the quantification of the movement velocity in the water maze and the trajectory length in each group to further dissect whether animals were constrained in the movement and hence, they could not get to the platform or they forgot where it was located. Also, it would help to compare the rats' performance after sham or TBI surgeries to their performance during the training before the surgeries (assuming the data during the training periods were recorded as well).

      We have added Supplemental Figure 10 to include all of this information. Importantly, movement velocity and distance traveled were not different between groups on testing day, and the learning curves of both groups were the same before sham/injury surgery.
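
      The memory-score comparison quoted above can be checked from the summary statistics alone. The sketch below computes Welch's t statistic and the Welch-Satterthwaite degrees of freedom from the reported means, SEMs, and group sizes; it stops short of a p-value, which requires the t distribution. The function name `welch_from_sem` is illustrative.

```python
import math


def welch_from_sem(mean_a, sem_a, n_a, mean_b, sem_b, n_b):
    """Welch's t statistic and approximate degrees of freedom,
    computed from group means, standard errors, and sample sizes."""
    var_a, var_b = sem_a ** 2, sem_b ** 2  # squared SEM equals s^2 / n
    t = (mean_a - mean_b) / math.sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1)
    )
    return t, df


# Values quoted from the manuscript text: sham vs injured memory score
t_stat, dof = welch_from_sem(114.8, 21.8, 9, 51.5, 6.8, 14)
print(round(t_stat, 2), round(dof, 1))  # 2.77 9.6
```

      A t statistic of about 2.77 on roughly 9.6 degrees of freedom is broadly in line with the reported two-sided p=0.020.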

    1. eLife Assessment

      This important study details changes in the brain functional connectivity in a longitudinal cohort of Gambian children assessed outside a lab setup with functional near-infrared spectroscopy (fNIRS) from age 5 to 24 months, in relation to early physical growth and cognitive flexibility capacities at preschool age. Evidence supporting conclusions on the evolution of brain connectivity is convincing and highlights a different trajectory compared with populations from high-income countries. However, analyses linking connectivity trajectories with early adverse conditions such as undernutrition and later cognitive development are only partially supported due to insufficient longitudinal data and statistical power. This study will be of significant interest to neuroscientists, psychologists and neuroimaging researchers working on infant development in relation to environmental factors.

    2. Reviewer #1 (Public review):

      Summary:

      This study utilizes fNIRS to investigate the effects of undernutrition on functional connectivity patterns in infants from a rural population in Gambia. fNIRS resting-state data recording spanned ages 5 to 24 months, while growth measures were collected from birth to 24 months. Additionally, executive functioning tasks were administered at 3 or 5 years of age. The results show an increase in left and right frontal-middle and right frontal-posterior connections with age and, contrary to previous findings in high-income countries, a decrease in frontal interhemispheric connectivity. Restricted growth during the first months of life was associated with stronger frontal interhemispheric connectivity and weaker right frontal-posterior connectivity at 24 months of age. Additionally, the study describes some connectivity patterns, including stronger frontal interhemispheric connectivity, which is associated with better cognitive flexibility at preschool age.

      Strengths:

      - The study analyses longitudinal data from a large cohort (n = 204) of infants living in a rural area of Gambia. This already represents a large sample for most infant studies, and it is impressive, considering it was collected outside the lab in a population that is underrepresented in the literature. The research question regarding the effect of early nutritional deficiency on brain development is highly relevant and may highlight the importance of early interventions. The study may also encourage further research on different underrepresented infant populations (i.e., infants not residing in Western high-income countries) or in settings where fMRI is not feasible.

      - The preprocessing and analysis steps are carefully described, which is very welcome in the fNIRS field, where well-defined standards for preprocessing and analysis are still lacking.

      Weaknesses:

      - The study provides a solid description of the functional connectivity changes in the first two years of life at the group level and investigates how restricted growth influences connectivity patterns at 24 months. However, it does not explore the links between adverse situations and developmental trajectories for functional connectivity. Given the longitudinal nature of the dataset, future work should expand the analysis using more sophisticated tools to link undernutrition to specific developmental trajectories in functional connectivity, and eventually incorporate additional data to increase statistical power.

      - Connectivity was assessed in 6 big ROIs to reduce variability due to head size and optode placement. Nevertheless, this also implies a significant reduction in spatial resolution. Individual digitalisation and co-registration of the optodes to a head model, followed by image reconstruction, could provide better spatial resolution. This is not a weakness specific to this study but rather a limitation common to most fNIRS studies, which typically analyse data at the channel level since digitalisation and co-registration can be challenging, especially in complex setups like this. The authors made an important effort to identify subjects with major optode displacement; however, future work might use tools to digitally record the positions of optodes and head markers.

    3. Reviewer #2 (Public review):

      Strengths:

      The article addresses a topic of significant importance, focusing on early life growth faltering in low-income countries, a key marker of undernutrition, and its impact on brain functional connectivity (FC) and cognitive development. The study's strengths include the laborious data collection process, as well as the rigorous data preprocessing methods employed to ensure high data quality. The use of cutting-edge preprocessing techniques further enhances the reliability and validity of the findings, making this a valuable contribution to the field of developmental neuroscience and global health.

      Weaknesses:

      The study lacks specificity in identifying which specific brain networks are affected by growth faltering, as the current exploratory analyses mainly provide an overall conclusion that infant brain network development is impacted without pinpointing the precise neural mechanisms or networks involved.

    4. Reviewer #3 (Public review):

      Summary

      This study aimed to investigate whether the development of functional connectivity (FC) is modulated by early physical growth, and whether these might impact cognitive development in childhood. This question was investigated by studying a large group of infants (N=204) assessed in Gambia with fNIRS at 5 visits between 5 and 24 months of age. Given the complexity of data acquisition at these ages and following data processing, data could be analyzed for 53 to 97 infants per age group. FC was analyzed considering 6 ensembles of brain regions and thus 21 types of connections. Results suggested that: i) compared to previously studied groups, this group of Gambian infants has a different FC trajectory, in particular with a change in frontal inter-hemispheric FC with age from positive to null values; ii) early physical growth, measured through weight-for-length z-scores from birth onwards, is associated with FC at 24 months. Some relationships were further observed between FC during the first two years and cognitive flexibility, in different ways between 4- and 5-year-old preschoolers, but results did not survive corrections for multiple comparisons.

      Strengths

      The question investigated in this article is important for understanding the role of early growth and undernutrition on brain and behavioral development in infants and children. The longitudinal approach considered is highly relevant to investigate neurodevelopmental trajectories. Furthermore, this study targets a little studied population from a low-/middle-income country, which was made possible by the use of fNIRS outside the lab environment. The collected dataset is thus impressive and it opens up a wide range of analytical possibilities.

      Weaknesses

      Data analyses were constrained by the limited number of children with longitudinal data on NIRS functional connectivity. Applying advanced statistical modeling approaches such as structural equation modelling would provide further insights on neurodevelopmental trajectories and relationships with early growth and later cognitive development.

    5. Author response:

      The following is the authors’ response to the previous reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This study utilises fNIRS to investigate the effects of undernutrition on functional connectivity patterns in infants from a rural population in Gambia. fNIRS resting-state data recording spanned ages 5 to 24 months, while growth measures were collected from birth to 24 months. Additionally, executive functioning tasks were administered at 3 or 5 years of age. The results show an increase in left and right frontal-middle and right frontal-posterior connections with age and, contrary to previous findings in high-income countries, a decrease in frontal interhemispheric connectivity. Restricted growth during the first months of life was associated with stronger frontal interhemispheric connectivity and weaker right frontal-posterior connectivity at 24 months of age. Additionally, the study describes some connectivity patterns, including stronger frontal interhemispheric connectivity, which is associated with better cognitive flexibility at preschool age.

      Strengths:

      The study analyses longitudinal data from a large cohort (n = 204) of infants living in a rural area of Gambia. This already represents a large sample for most infant studies, and it is impressive, considering it was collected outside the lab in a population that is underrepresented in the literature. The research question regarding the effect of early nutritional deficiency on brain development is highly relevant and may highlight the importance of early interventions. The study may also encourage further research on different underrepresented infant populations (i.e., infants not residing in Western high-income countries) or in settings where fMRI is not feasible.

      The preprocessing and analysis steps are carefully described, which is very welcome in the fNIRS field, where well-defined standards for preprocessing and analysis are still lacking.

      We thank the reviewer for highlighting the strengths of this work.

      Weaknesses:

      While the study provides a solid description of the functional connectivity changes in the first two years of life at the group level and investigates how restricted growth influences connectivity patterns at 24 months, it does not explore the links between adverse situations and developmental trajectories for functional connectivity. Considering the longitudinal nature of the dataset, it would have been interesting to apply more sophisticated analytical tools to link undernutrition to specific developmental trajectories in functional connectivity. The authors mention that they lack the statistical power to separate infants into groups according to their growing profiles. However, I wonder if this aspect could not have been better explored using other modelling strategies and dimensional reduction techniques. I can think about methods such as partial least squares correlation, with age included as a numerical variable and measures of undernutrition.

      We agree with the reviewer that this complex and rich longitudinal dataset would benefit from more sophisticated analytical approaches to characterise developmental trajectories in functional connectivity and to more directly link them to measures of undernutrition. However, conducting such analyses would require substantial additional methodological development, model validation, and careful interpretation, which fall beyond the scope and timeline of the present manuscript. Our aim here was to provide a clear and robust characterisation of functional connectivity changes during the first two years of life and to examine associations with growth outcomes at a specific developmental stage, while ensuring methodological transparency and statistical reliability. Importantly, these more advanced trajectory-based analyses are currently being pursued in the final phase of the BRIGHT project (BRIGHT IMPACT), in collaboration with expert statisticians and data scientists. This ongoing work aims specifically to leverage the longitudinal richness of the dataset to model developmental trajectories and their associations with early-life adversity and nutritional factors. We therefore see the present study as an important foundation for these forthcoming analyses.

      Connectivity was assessed in 6 big ROIs. While the authors justify this choice to reduce variability due to head size and optodes placement, this also implies a significant reduction in spatial resolution. Individual digitalisation and co-registration of the optodes to the head model, followed by image reconstruction, could have provided better spatial resolution. This is not a weakness specific to this study but rather a limitation common to most fNIRS studies, which typically analyse data at the channel level since digitalisation and co-registration can be challenging, especially in complex setups like this. However, the BRIGHT project has demonstrated that it is possible and that differences in placement affect activation patterns, which become more localised when data is co-registered at the subject level (Collins-Jones et al., 2021). Could the co-registration of individual data have increased sensitivity, particularly given that longitudinal effects are being investigated?

      We agree with the reviewer that the fNIRS community should work toward more precise methods for spatial registration of optodes, not only at the group level but also at the subject level, in order to make more precise inferences about the locations of activations. However, we followed a very thorough offline procedure to model headgear placement based on each participant’s photographs, which we believe complements the coregistration work performed by Collins-Jones et al. (2021). As reported in the fNIRS data acquisition section, “Infants were excluded from further analysis if the band was excessively high over the front above the eyebrows” (line 409, methods section). Moreover, channel displacement was measured from the photos, and channels with a displacement “equal or greater than 1.6 cm were renumbered, so that each channel was shifted either backward or forward one full channel location in space” (line 413, methods section). While these practices are thoroughly followed in the BRIGHT project, we are aware that they are not part of the standard procedure in many infant fNIRS studies. We hope that this work provides guidance for other researchers on how to coregister infant fNIRS data.

      Considering the spatial resolution of fNIRS, which is on the order of centimetres, and the thorough procedure combining fNIRS–MRI coregistration with channel displacement assessment based on photographs, we do not think that individual-level coregistration would have significantly increased the sensitivity of the results.
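
      The renumbering rule quoted above can be sketched as a simple threshold check. This is only an illustration under stated assumptions, not the BRIGHT project's code: the function name `channel_shift`, the signed-displacement input, and the sign convention (positive = forward) are hypothetical; only the 1.6 cm threshold and the one-location shift come from the quoted methods text.

```python
# Hypothetical sketch of the renumbering rule; names and sign convention are assumptions.
THRESHOLD_CM = 1.6  # threshold quoted from the methods section


def channel_shift(displacement_cm):
    """Return how many channel locations to shift for a measured displacement.

    displacement_cm: signed headgear displacement measured from photographs
    (assumed convention: positive = forward, negative = backward).
    Returns +1 or -1 when the displacement is equal to or greater than the
    threshold in magnitude, and 0 otherwise.
    """
    if abs(displacement_cm) >= THRESHOLD_CM:
        return 1 if displacement_cm > 0 else -1
    return 0


shifts = [channel_shift(d) for d in (0.4, 1.6, -2.0)]
```

      Under these assumptions, a displacement below the threshold leaves the channel numbering unchanged, while larger displacements move every channel by exactly one location.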

      I believe that a further discussion in the manuscript on the application of global signal regression and its effects could have been beneficial for future research and for readers to better understand the negative correlations described in the results. Since systemic physiological changes affect HbO/HbR concentrations, resulting in an overestimation of functional connectivity, regressing the global signal before connectivity computation is a common strategy in fNIRS and fMRI studies. However, the recommendation for this step remains controversial, likely depending on the case (Murphy & Fox, 2017). I understand that different reasons justify its application in the current study. In addition to systemic physiological changes originating from brain tissue, fNIRS recordings are contaminated by changes occurring in superficial layers (i.e., the scalp and skull). While having short-distance channels could have helped to quantify extracerebral changes, challenges exist in using them in infant populations, especially in a longitudinal study such as the one presented here. The optimal source-detector distance that minimises sensitivity to changes originating from the brain would increase with head size, and very young participants would require significantly shorter source-detector distances (Brigadoi & Cooper, 2015). Thus, having them would have been challenging. Under these circumstances (i.e., lack of short channels and external physiological measures), and considering that the amount the signal is affected by physiological noise (either coming from the brain or superficial tissue) might change through development, the choice of applying global signal regression is justified. Nevertheless, since the method introduces negative correlations in the data by forcing connectivity to average to zero, I believe a further discussion of these points would have enriched the interpretation of the results.

      We added a paragraph discussing the choice of using GSR in our pipeline in the discussion of the manuscript as follows: “Importantly, these results remained significant even without GSR, indicating that our findings are not solely driven by preprocessing choices. While the use of GSR in FC studies remains debated (Murphy & Fox, 2017), in the absence of short channels (which are difficult to use reliably with infants (Emberson et al., 2016)) and external physiological measures, applying GSR represented the most appropriate preprocessing option. In fact, failure to correct for systemic physiological fluctuations can, in fact, lead to artificially elevated connectivity estimates in fNIRS data (Abdalmalak et al., 2022)” (line 250, discussion section).
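
      The two effects discussed in this exchange (inflated connectivity without correction, and negative correlations after GSR) can be illustrated with a minimal sketch on synthetic data. This is not the authors' preprocessing pipeline: the function name `global_signal_regression`, the six synthetic channels, and the noise levels are assumptions chosen for illustration only.

```python
import numpy as np


def global_signal_regression(data):
    """Regress the channel-average (global) signal out of every channel.

    data: array of shape (n_samples, n_channels), e.g. HbO time series.
    Returns the residuals, with the same shape as the input.
    """
    centered = data - data.mean(axis=0)
    global_signal = centered.mean(axis=1)
    # Least-squares slope of each channel on the global signal
    beta = centered.T @ global_signal / (global_signal @ global_signal)
    return centered - np.outer(global_signal, beta)


def mean_offdiagonal(corr):
    """Average of the off-diagonal entries of a square correlation matrix."""
    n = corr.shape[0]
    return (corr.sum() - np.trace(corr)) / (n * (n - 1))


rng = np.random.default_rng(0)
shared = rng.standard_normal(2000)  # synthetic stand-in for systemic physiology
data = shared[:, None] + 0.5 * rng.standard_normal((2000, 6))

cleaned = global_signal_regression(data)
mean_corr_before = mean_offdiagonal(np.corrcoef(data.T))
mean_corr_after = mean_offdiagonal(np.corrcoef(cleaned.T))
```

      In this toy setting the shared component makes all channels strongly correlated before correction. After regression the residuals sum to zero across channels at every time point, so the average pairwise correlation is pushed below zero, which is why negative values after GSR need cautious interpretation.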

      Reviewer #2 (Public review):

      Strengths:

      The article addresses a topic of significant importance, focusing on early life growth faltering in low-income countries, a key marker of undernutrition, and its impact on brain functional connectivity (FC) and cognitive development. The study's strengths include the laborious data collection process, as well as the rigorous data preprocessing methods employed to ensure high data quality. The use of cutting-edge preprocessing techniques further enhances the reliability and validity of the findings, making this a valuable contribution to the field of developmental neuroscience and global health.

      We thank the reviewer for highlighting the strengths of this work.

      Weaknesses:

      The study fails to fully leverage its longitudinal design to explore neurodevelopmental changes or trajectories, as highlighted by all three reviewers. The revised manuscript still primarily focuses on FC values at a single age stage (i.e., 24 months) rather than utilizing the longitudinal data to investigate how FC evolves over time or predicts cognitive development. Although the authors acknowledge that analyzing changes in FC (ΔFC) would reduce degrees of freedom (to ~30) and risk interpretability, they do not report or discuss these results, even as exploratory findings.

      As suggested, we added the table reporting the results of the associations between changes in functional connectivity (ΔFC) between 5 and 24 months and cognitive flexibility in the supplementary materials (Table SI3). We additionally explored the relationship between changes in growth and cognitive flexibility as suggested by Reviewer #3 and we reported these additional analyses in the text as follows: “We also explored whether changes in growth and changes in functional connectivity between 5 and 24 months were associated with cognitive flexibility at preschool age, but we did not find any significant association (Table SI3 and Table SI4).” (line 213, results section).

      Furthermore, the study lacks specificity in identifying which specific brain networks are affected by growth faltering, as the current exploratory analyses mainly provide an overall conclusion that infant brain network development is impacted without pinpointing the precise neural mechanisms or networks involved.

      We added this limitation in the discussion as follows: “While the impact of undernutrition on brain development has been documented in LMICs (46), herein, we provided empirical evidence that growth faltering specifically in infants younger than five months of age impacts observable development of functional brain networks in the second year of life. Future studies may be needed to pinpoint which specific brain networks are impacted” (line 279, discussion section).

      Reviewer #3 (Public review):

      Summary

      This study aimed to investigate whether the development of functional connectivity (FC) is modulated by early physical growth, and whether these might impact cognitive development in childhood. This question was investigated by studying a large group of infants (N=204) assessed in Gambia with fNIRS at 5 visits between 5 and 24 months of age. Given the complexity of data acquisition at these ages and following data processing, data could be analyzed for 53 to 97 infants per age group. FC was analyzed considering 6 ensembles of brain regions and thus 21 types of connections. Results suggested that: i) compared to previously studied groups, this group of Gambian infants has a different FC trajectory, in particular with a change in frontal inter-hemispheric FC with age from positive to null values; ii) early physical growth, measured through weight-for-length z-scores from birth on, is associated with FC at 24 months. Some relationships were further observed between FC during the first two years and cognitive flexibility, in different ways between 4- and 5-year-old preschoolers, but results did not survive corrections for multiple comparisons.

      Strengths

      The question investigated in this article is important for understanding the role of early growth and undernutrition on brain and behavioral development in infants and children. The longitudinal approach considered is highly relevant to investigate neurodevelopmental trajectories. Furthermore, this study targets a little-studied population from a low-/middle-income country, which was made possible by the use of fNIRS outside the lab environment. The collected dataset is thus impressive and it opens up a wide range of analytical possibilities.

      We thank the reviewer for highlighting the strengths of this work.

      Weaknesses

      Data analyses were constrained by the limited number of children with longitudinal data on NIRS functional connectivity. Nevertheless, considering more advanced statistical modelling approaches would be relevant to further explore neurodevelopmental trajectories as well as relationships with early growth and later cognitive development.

      While in this study we selected specific FC and outcome variables based on our hypothesis, the final phase of the BRIGHT project, known as BRIGHT IMPACT, aims to apply advanced statistical models to integrate a range of project variables into a single comprehensive analysis. We have acknowledged this in the discussion as follows: “Applying more advanced statistical modelling methods and structural equation modelling analyses may provide greater insight with further investigations in contexts of adversity and, in turn, establish which outcomes are predicted by FC” (line 309, discussion section).

      The abstract and end of the discussion should make it clearer that the associations between FC and cognitive flexibility are results that need to be confirmed, insofar as they did not survive correction for multiple comparisons.

      We have acknowledged this in the abstract as follows: “Our results highlight the measurable effects that poor growth in early infancy has on brain development and the possible subsequent impact on pre-school age cognitive development, underscoring the need for early life interventions throughout global settings of adversity”.

      We have acknowledged this in the discussion as follows: “While our results are consistent with previous studies, we acknowledge that the significant associations between early FC and later cognitive flexibility do not withstand multiple comparisons. Therefore, we encourage future studies that may replicate these findings with a larger sample” (line 300, discussion section).

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) In Figure 1 B and C the authors should indicate that the results refer to HbO.

      We have added this specification to the figure caption as suggested.

      (2) Figure SI2. Please indicate in the caption that these are the results when pre-processing did not include global signal regression.

      We have added this specification to the figure caption as suggested.

      Reviewer #3 (Recommendations for the authors):

      (1) The sentence l529-531 ("To investigate whether FC early in life predicted...") should be more explicit as it is not clear which of the two variables is regressed by the other: is it the measure of cognitive flexibility that is regressed by FC, as the hypothesis suggests? Were other variables considered in the regression model? (For linear regression with only one "prediction" variable, the square root of the coefficient of determination R² is equal to the correlation between the two variables.)

      Yes, it is the measure of cognitive flexibility that is regressed by FC. We have rephrased it in the text as follows: “we regressed later cognitive flexibility against FC that showed a significant change across the first two years of life”. There were no other variables in the regression model.
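
      The reviewer's parenthetical remark about single-predictor regression can be checked numerically. The sketch below uses synthetic data only; the variable names `fc` and `flexibility` and the simulated effect size are assumptions and do not reproduce any value from the study.

```python
import numpy as np

rng = np.random.default_rng(1)
fc = rng.standard_normal(50)                       # synthetic predictor (stand-in for FC)
flexibility = 0.4 * fc + rng.standard_normal(50)   # synthetic outcome

# Ordinary least-squares fit with a single predictor and an intercept
slope, intercept = np.polyfit(fc, flexibility, 1)
predicted = slope * fc + intercept

# Coefficient of determination from residual and total sums of squares
ss_res = np.sum((flexibility - predicted) ** 2)
ss_tot = np.sum((flexibility - flexibility.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

# Pearson correlation between predictor and outcome
r = np.corrcoef(fc, flexibility)[0, 1]
```

      With one predictor and an intercept, `r_squared` equals `r ** 2` up to floating-point error, so the square root of R² recovers the magnitude of the correlation; the sign comes from the slope.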

      (2) A summary table of the statistical results for FC-cognitive flexibility associations should be included as for other analyses, in addition to Figure 3B.

      We added a table of the results for the association between FC and cognitive flexibility in the supplementary materials (Table SI2, page 10), matching the same colours of Table 2. We referenced the table in the text in the main manuscript (line 211, result section).

      (3) Figure 3B: The legend should specify that these results did not survive corrections for multiple comparisons.

      We have specified this in the legend of Figure 3 as suggested.

      (4) For the young pre-schooler group, it seems that the age is around 4 years (age mean +/- SD=47.96 +/- 2.77 months) and not 3 years as indicated at several places in the manuscript.

      We found only one instance in which we erroneously said that the younger preschoolers were around 3 years. We replaced “Gambian infants from BRIGHT were cross-sectionally assessed at the age of 3 or 5 years for cognitive flexibility” with “Gambian infants from BRIGHT were cross-sectionally assessed between the age of 3 and 5 years for cognitive flexibility” (line 489, method section).

      (5) The authors use the term "intra-hemispheric" connections for the ones within each of the 6 sections. This might be misleading since fronto-posterior connections are also intra-hemispheric ones. Specifying "short-range" or "within-section" connections might be clearer.

      As suggested by the reviewer, we replaced “intra-hemispheric” with “intra-hemispheric within section” where appropriate throughout the manuscript.

      (6) Abstract: what is the justification for using the term "optimal" for describing developmental trajectories of FC?

      The term “optimal” refers to knowledge about typical developmental trajectories, coming especially from fMRI studies, as mentioned in the introduction: “Based on data from fMRI, current models hypothesize that FC patterns mature throughout early development (23–27), where in typically developing brains, adult-like networks emerge over the first years of life as long-range functional connections between pre-frontal, parietal, temporal, and occipital regions become stronger and more selective (28–31). [...]. Importantly, normative developmental patterns may be disrupted and even reversed in clinical conditions that impact development; e.g., increased short-range and reduced long-range FC have been observed in preterm infants (36) and in children with autism spectrum disorder (37, 38)” (line 93-106, introduction).

      (7) The confidence interval should be added in Figure SI3.

      As suggested, confidence intervals have been added in Figure SI3.

      (8) Other scatterplot examples of associations might be added as supplementary information.

      As suggested, we added several additional scatterplots to Figure SI3 (with confidence intervals as noted in the comment above) to show other associations between changes in growth and FC at 24 months.

      (9) Figure SI6: % in x-axis is still indicated.

      We apologise for the oversight; all the percentage signs have now been removed from the x-axis tick labels.

      (10) The authors might show the (even not significant) results of the associations between changes in growth and cognitive flexibility in supplementary information.

      As suggested, we added the table reporting the results of the associations between changes in growth (ΔWLZ) and cognitive flexibility in the supplementary materials (Table SI3). We additionally explored the relationship between changes in functional connectivity and cognitive flexibility as suggested by Reviewer #2 and we reported these additional analyses in the text as follows: “We also explored whether changes in growth and changes in functional connectivity between 5 and 24 months were associated with cognitive flexibility at preschool age, but we did not find any significant association (Table SI3 and Table SI4).” (line 213, results section).

    1. eLife Assessment

      Hoverflies are known for their sexually dimorphic visual systems and exquisite flight behaviors. This valuable study reports how two types of visual descending neurons differ between males and females in their motion- and speed-dependent responses, yet surprisingly, the behavior they control lacks any sexual dimorphism. The results convincingly support these findings, which will be of interest for studies of visuomotor transformations and network-level brain organization.

    2. Reviewer #1 (Public review):

      Summary:

      Hoverflies are renowned for their striking sexual dimorphism in eye morphology and early visual system physiology, as well as in sexually dimorphic behaviors. Surprisingly, male and female flight behaviors in response to optic flow exhibit only subtle differences. Nicholas et al. investigate the sensorimotor transformation of sexually dimorphic visual information into flight steering commands via descending neurons. Using a combination of intracellular and extracellular recordings, neuroanatomical analysis, and behavioral assays, the authors convincingly demonstrate that descending neurons, particularly at high optic flow velocities, exhibit pronounced sexual dimorphisms, while wing steering responses remain largely monomorphic. The study highlights a very interesting discrepancy between neuronal and behavioral response properties.

      More specifically, the authors focused on two types of descending neurons that receive inputs from well-characterized wide-field sensitive tangential cells: OFS DN1 and OFS DN2. Their likely counterparts in Drosophila connect to neck, wing and haltere neuropils. The authors characterized the visual response properties of these two neuronal classes in both male and female hoverflies and identified several interesting differences. They then presented the same set of stimuli, tracked wing beat amplitude and analyzed the sum and the difference of right and left wing beat amplitude as a readout of lift or thrust, and yaw turning, respectively. Behavioral responses showed little to no sexual dimorphism, despite the observed neuronal differences.

      Strengths:

      I find the question very interesting and the results both convincing and intriguing. A fundamental goal in neuroscience is to link neuronal responses and behavior. The current study highlights that the transformations - even at the level of descending neurons to motoneurons - are complex and less straightforward than one might expect.

      Weaknesses:

      The authors investigated two types of descending neurons, but it was not clear to me how many other descending neurons are thought to be involved in wing steering responses to wide-field motion. I would suggest providing a more in-depth overview of what is known in hoverflies and Drosophila, since the conclusions drawn from the study would be different if these two types were the only descending neurons involved, as opposed to representing a subset of the neurons conveying visual information to the wing neuropil.

      Both neuronal classes have counterparts in Drosophila that also innervate neck motor regions. The authors filled hoverfly DNs in intracellular recordings to characterize their arborization in the ventral nerve cord. In my opinion, these anatomical data could be further exploited and discussed a bit more: is the innervation in hoverflies also consistent with connecting to the neck and haltere motor regions? Are there any obvious differences and similarities to the Drosophila neurons mentioned by the authors? If the arborization also supports a role in neck movements, the authors could discuss whether they would expect any sexual dimorphism in head movements.

      Revision comment:

      I thank the authors for their detailed replies to my questions and the additional clarifications and analysis included in the paper. All my concerns have been addressed.

    3. Reviewer #2 (Public review):

      Summary:

      Many fly species exhibit male-specific visual behaviors during courtship while little is known about the circuit underlying the dimorphic visuomotor transformations. Nicholas et al focus on two types of visual descending neurons (DNs) in hoverflies, a species in which only males exhibit high-speed pursuit of conspecifics. They combined electrophysiology and behavior analysis to identify these DNs and characterize their response to a variety of visual stimuli in both male and female flies. The results show that the neurons in both sexes have similar receptive fields but exhibit speed-dependent dimorphic responses to different optic flow stimuli.

      Strengths:

      Hoverflies, though not a common model system, show very interesting dimorphic behaviors and provide a unique and valuable entry point to explore the brain organization behind sexual dimorphism. The findings here are not only interesting in their own right but will also likely inspire those working in other systems, particularly Drosophila.

      The authors employed rigorous morphology, electrophysiology, and behavior methods to deliver comprehensive characterization of the neurons in question. The precision of the measurements allowed for identifying a subtle and nuanced neuronal dimorphism and set a standard for future work in this area.

      Weaknesses:

      I'd like to thank the authors for the revised manuscript, especially the new analyses and figures. Most of my earlier concerns have been satisfactorily addressed by now. Interested readers are kindly referred to the authors' responses for the discussion of the limitations of this work.

    4. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      Hoverflies are known for their sexually dimorphic visual systems and exquisite flight behaviors. This valuable study reports how two types of visual descending neurons differ between males and females in their motion- and speed-dependent responses, yet surprisingly, the behavior they control lacks any sexual dimorphism. The results convincingly support these findings, which will be of interest for studies of visuomotor transformations and network-level brain organization.

      This statement perfectly recapitulates our findings.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Hoverflies are known for a striking sexual dimorphism in eye morphology and early visual system physiology. Surprisingly, the male and female flight behaviors show only subtle differences. Nicholas et al. investigate the sensori-motor transformation of sexually dimorphic visual information to flight steering commands via descending neurons. The authors combined intra- and extracellular recordings, neuroanatomy, and behavioral analysis. They convincingly demonstrate that descending neurons show sexual dimorphisms - in particular at high optic flow velocities - while wing steering responses seem relatively monomorphic. The study highlights a very interesting discrepancy between neuronal and behavioral response properties.

      Thank you for this summary. Most of the statement perfectly recapitulates the main findings of our paper. However, we want to emphasize that some hoverfly flight behaviors are strongly sexually dimorphic, especially those related to courtship and mating. Indeed, only male hoverflies pursue targets at high speed, chase away territorial intruders, and pursue females for mating. However, other flight behaviours, such as those related to optomotor responses and flights between flowers when feeding, are not sexually dimorphic. We have amended the Introduction and Discussion to make the difference between flight behaviors clearer. Please see lines 77 and 305 onwards.

      More specifically, the authors focused on two types of descending neurons that receive inputs from well-characterized wide-field sensitive tangential cells: OFS DN1, which receives inputs from so-called HS cells, and OFS DN2, which receives input from a set of VS cells. Their likely counterparts in Drosophila connect to the neck, wing, and haltere neuropils. The authors characterized the visual response properties of these two neuronal classes in both male and female hoverflies and identified several interesting differences. They then presented the same set of stimuli, tracked wing beat amplitude, and analyzed the sum and the difference of right and left wing beat amplitude as a readout of lift or thrust, and yaw turning, respectively. Behavioral responses showed little to no sexual dimorphism, despite the observed neuronal differences.

      Thank you for this very nice summary of our work. We want to clarify that LPTC input to DN1 and DN2 has not been shown directly in hoverflies using e.g. dye coupling, or dual recordings. Instead, the presumed HS and VS input is inferred from morphological and physiological DN evidence, and comparisons to similar data in Drosophila and blowflies. We have amended the Introduction to clarify this. Please see line 64 onwards. The rest of the paragraph perfectly recapitulates the main findings of our paper.
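
      The behavioral readouts described in the reviewer's summary, the sum and the difference of left and right wing beat amplitude (WBA), can be sketched as follows. The function name `wba_readouts`, the left-minus-right sign convention, and the example amplitudes are assumptions for illustration; the source specifies only that the sum serves as a readout of lift or thrust and the difference as a readout of yaw turning.

```python
import numpy as np


def wba_readouts(left_wba, right_wba):
    """Combine per-frame left and right wing beat amplitudes.

    Returns (wba_sum, wba_diff):
      wba_sum  = left + right, read out as lift or thrust
      wba_diff = left - right, read out as yaw turning
                 (sign convention assumed, not taken from the source)
    """
    left = np.asarray(left_wba, dtype=float)
    right = np.asarray(right_wba, dtype=float)
    return left + right, left - right


# Hypothetical amplitudes in degrees for two video frames
wba_sum, wba_diff = wba_readouts([60.0, 70.0], [60.0, 50.0])
```

      In the second hypothetical frame the summed amplitude is unchanged while the difference is non-zero, which would be read as a yaw turn without a change in lift or thrust under these assumptions.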

      Strengths:

      I find the question very interesting and the results both convincing and intriguing. A fundamental goal in neuroscience is to link neuronal responses and behavior. The current study highlights that the transformations - even at the level of descending neurons to motoneurons - are complex and less straightforward than one might expect.

      Thank you.

      Weaknesses:

      The authors investigated two types of descending neurons, but it was not clear to me how many other descending neurons are thought to be involved in wing steering responses to wide-field motion. I would suggest providing a more in-depth overview of what is known about hoverflies and Drosophila, since the conclusions drawn from the study would be different if these two types were the only descending neurons involved, as opposed to representing a subset of the neurons conveying visual information to the wing neuropil.

      This is a great point. There are around 1000 descending neurons identified in Drosophila, of which many could respond to widefield motion, without being specifically tuned to widefield motion. In Drosophila, at least 35 descending neuron types receive input in the part of the brain where the LPTC outputs are located, and at least 29 descending neuron types project to the wing motor neuropil. Thus, it is more than likely that other neurons project visual widefield motion information to the wing neuropil. Furthermore, we only measured wing beat amplitude (WBA) as seen in the horizontal plane, as we were filming from above. As such, other wing angle changes and rotations are not quantified. We have amended our Introduction (see line 53 onwards) and Discussion (see line 320 onwards) to address these important points.

      Both neuronal classes have counterparts in Drosophila that also innervate neck motor regions. The authors filled the hoverfly DNs in intracellular recordings to characterize their arborization in the ventral nerve cord. In my opinion, these anatomical data could be further exploited and discussed a bit more: is the innervation in hoverflies also consistent with connecting to the neck and haltere motor regions? Are there any obvious differences and similarities to the Drosophila neurons mentioned by the authors? If the arborization also supports a role in neck movements, the authors could discuss whether they would expect any sexual dimorphism in head movements.

      These are all great points. We did not see any clear arborizations to the frontal nerve (FN), where we would expect to find the neck motor neurons (NMNs). In addition, while we did see fine arborizations throughout the length of the thoracic ganglion, we saw no strong outputs projecting directly to the haltere nerve (HN). In the revised version of the MS we have modified figure 4 (morphological characterization) to show a magnification of the thoracic ganglion to clarify this.

      There are important differences between the morphology of DN1 and DN2 in hoverflies and DNHS1 and DNOVS2 in Drosophila, in terms of their projections in the thoracic ganglion. For example, in Drosophila DNOVS2, there are several fine branches along the length of the neuron in the thoracic ganglia. Similarly, we found fine branches in Eristalis tenax DN2; in addition, however, we found a wide branch projecting to the area of the thoracic ganglion where the prothoracic and pterothoracic nerves likely get their inputs, which we also found in Eristalis tenax OFS DN1 (Figure 4). This suggests that both neurons could contribute to controlling the wings and/or the forelegs (which is why we quantified the WBA). In Drosophila DNOVS1, there is a similar fat branch to the prothoracic and pterothoracic nerves. Furthermore, while Drosophila DNHS1 and DNOVS2 have different morphology, DN1 and DN2 in Eristalis looked similar. We have modified the Results section to make this clear; see line 193 onwards.

      In addition, our revised version of the MS includes analysis of the movement of different body parts (the head angle, fore- and hindleg extension) to investigate this further, and to look for sexual dimorphism. Unfortunately, however, this did not include the halteres, as they cannot be seen well in the videos. The new data can be seen in Figure 7.

      Reviewer #2 (Public review):

      Summary:

      Many fly species exhibit male-specific visual behaviors during courtship, while little is known about the circuit underlying the dimorphic visuomotor transformations. Nicholas et al focus on two types of visual descending neurons (DNs) in hoverflies, a species in which only males exhibit high-speed pursuit of conspecifics. They combined electrophysiology and behavior analysis to identify these DNs and characterize their response to a variety of visual stimuli in both male and female flies. The results show that the neurons in both sexes have similar receptive fields but exhibit speed-dependent dimorphic responses to different optic flow stimuli.

      This statement perfectly recapitulates the main findings of our paper. As mentioned above, while hoverfly flight behaviors related to courtship and mating are strongly sexually dimorphic, other flight behaviours, such as those related to optomotor responses and flights between flowers when feeding, are not. We have amended the Introduction and Discussion to make the difference between flight behaviors more clear. Please see lines 77 and 305 onwards.

      Strengths:

      Hoverflies, though not a common model system, show very interesting dimorphic behaviors and provide a unique and valuable entry point to explore the brain organization behind sexual dimorphism. The findings here are not only interesting in their own right but will also likely inspire those working in other systems, particularly Drosophila.

      Thank you.

      The authors employed rigorous morphology, electrophysiology, and behavior methods to deliver a comprehensive characterization of the neurons in question. The precision of the measurements allowed for identifying a subtle and nuanced neuronal dimorphism and set a standard for future work in this area.

      Thank you.

      Weaknesses:

      Cell-typing using receptive field preferred directions (RFPDs): if I understood correctly, this classification method mostly relies on the LPDs near the center of the receptive field (median within the contour in Fig.1). I have two concerns here. First, this method is great if we are certain there are only two types of visual DNs as described in the manuscript. But how certain is this? Given the importance of vision in flight control, I would expect many DNs that transmit optic flow information to the motor center. I'd also like to point out that there are other lobula plate tangential cells (LPTCs) than HS and VS cells, which are much less studied and could potentially contribute to dimorphic behaviors.

      This is very true, and important. As mentioned above, in Drosophila there are 35 descending neuron types with inputs on the dorsal surface of the brain (labelled DNp1-35), suggesting that they could receive input from LPTCs. However, only 3 of these have been shown physiologically and morphologically to receive LPTC input, in blowflies and Drosophila (DNHS1, DNOVS1, DNOVS2). Note that in both blowflies and fruitflies DNOVS1 gives graded responses, and no action potentials, meaning that we would not be able to record from it using extracellular electrophysiology.

      We previously used clustering techniques to show that in Eristalis, we can reliably distinguish two types of optic flow sensitive DNs from extracellular electrophysiological data, based on a range of receptive field parameters, and we think that these correspond to DNHS1 and DNOVS2 in Drosophila (Nicholas et al, J Comp Physiol A, 2020, cited in paper). As mentioned above in response to Reviewer 1, this does not mean that there are no other neurons that could respond to widefield optic flow, and which might be involved in the WBA we recorded in the paper. However, the point of this paper was not to conclusively show that there are only two optic flow sensitive descending neurons. The point was to say that there are two quite distinct optic flow sensitive neurons that have similar receptive fields in males and females, while their velocity response functions differ between males and females.

      We have modified the Introduction (see lines 53 and 64 onwards) and Discussion to make these important points clear to the Reader, including a mention of the 45-60 LPTCs that exist in the lobula plate, and what their role might be.

      Second, this method feels somewhat impoverished given the richness of the data. The authors have nicely mapped out the directional tuning for almost the entire visual field. Instead of reducing this measurement to 2 values (center and direction), I was wondering if there is a better method to fully utilize the data at hand to get a better characterization of these DNs. As the authors are aware, local features alone can be ambiguous in characterizing optic flows. What's more, taking into account more global features can be useful for discovering potentially new cell types.

      This is a great point, and we did analyse other receptive field properties in this study (shown in previous supp fig 1). In addition, and as mentioned above, we have published a clustering analysis across receptive field properties of these neurons (Nicholas et al, J Comp Physiol A, 2020, cited in paper). The point that we attempted to make in this paper was that by using two strikingly simple metrics, we can reliably distinguish which of the two neuron types we are recording from simply based on azimuthal location and overall directional preference. This makes automated analysis very straightforward. Indeed, we now use this routinely to ID what neuron we are recording from computationally, rather than making a human-based assumption.

      However, we agree that this needs to be shown, and that further in depth analysis was warranted. Therefore, we have provided additional receptive field analysis and clustering (see new supplementary figure 1) and associated text. We also want to highlight that all data is uploaded to Data Dryad for anyone interested in doing additional in-depth analyses.

      Line 131, it wasn't clear to me why full-screen stimuli were used for comparison here, instead of the full receptive field maps. Male flies exhibit sexually dimorphic behaviors only during courtship, which would suggest that small-sized visual stimuli (mimicking an intruder or female conspecific) would be better suited to elicit dimorphic neuronal responses. A similar comment applies to the later results as well. Based on the receptive field mapping in Figure 1, I'm under the impression that these 2 DN types are more suited to detect wide-field optic flows, those induced by self-motion as mentioned in the manuscript. The results are still very interesting, but it's good to make this point clear early on to help set appropriate expectations. Conversely, this would also suggest that there are other visual DN types that are responsible for the courtship-related sexually dimorphic behaviors.

      Thank you for mentioning these important points. Our reasoning for using full-screen stimuli for the analysis on line 131 was that since we used the small sinusoidal gratings for mapping the receptive fields, and to subsequently classify the neurons, it would be unfair to use the same data to investigate potential sexual dimorphism. I.e., we selected neurons that fulfilled certain criteria, and then we cannot rightfully use the same criteria to determine differences. This was not explicitly mentioned in the paper, so we have modified the text to make this clear to the Reader, see lines 142 onwards.

      However, in Supp Figure 2d/e we show that there are no striking receptive field differences between males and females in terms of receptive field center or directional preference. In Supp Figure 2f we also show that there is no difference between male and female receptive field height and width. We have modified the text to draw the Reader’s attention to this figure, and also mention the additional analysis done in response to the comment above.

      As a side note, I personally expected at least DN1 to have a smaller receptive field in males, as the hoverfly HSN is strikingly sexually dimorphic (Nordström et al, Curr Biol 2008). However, while optic flow sensitive DNs do respond to small objects (see e.g. the J Comp Physiol paper mentioned above) we did not detect any obvious sexual dimorphism in receptive field properties. Indeed, we think that a different subset of DNs control parts of target pursuit behavior (target selective DNs (TSDNs)). This is now addressed in the modified version of the paper, see line 89-92.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) I think that the additional measurement of head turns in response to some of the stimuli that showed the strongest sexual dimorphism would be very interesting, but I fully acknowledge that this might be beyond the scope of the current paper or technically too challenging, requiring additional cameras and a whole new tracking software, etc.

      We have added an additional figure to the paper, with associated text, showing the response of the head, fore- and hindlegs to the same stimuli, as far as we could extract them with only one camera filming from above. The new data can be found in the new figure 7, and associated text.

      (2) Are the onset measurements for WBD comparable across flight manoeuvres, given that they are limited to a single projection plane?

      This is a great point, and we have now added this caveat in the text, see line 261-262.

      (3) Line 62 - typo: DNp15 not NDp15.

      Thank you, fixed.

      Reviewer #2 (Recommendations for the authors):

      (1) Related to a comment earlier, in the Introduction, it is mentioned that there are 3 optic flow-sensitive DNs in Drosophila and blowfly. However, I don't see convincing evidence for this in the cited references, none of which have exclusively surveyed all the DNs.

      We have revised this to say that 3 neurons have been identified morphologically and physiologically, but that does not mean that there are no others. Please see line 60 onwards.

      (2) Line 142 and Supplementary Figure 3, this is stated in the next section, but I think it's better to make it clear that DN2 in females has a higher spontaneous rate before mentioning the starfield. Please also specify if the stationary starfield affects the DN2 rate at all in the female flies.

      Great points. We now describe the spontaneous rate before mentioning the responses to moving starfield stimuli, and highlight that there is no difference between no stimulus (pre-stimulation) and a stationary stimulus. Please see lines 155 onwards.

      (3) Line 34, 'redress' should be 'to address'.

      Thank you, fixed.

      (4) Line 59, a bit unclear to me what this sentence is trying to say. Also, I wouldn't say LPTCs are 'indirect' in the sensorimotor transformation -- it's a necessary link in this pathway, no?

      That was indeed a strange sentence. We have simplified it to the following: “LPTCs project to the inferior posterior slope[6], where they synapse with descending neurons[7,8]. In Drosophila at least 35 descending neuron types have their inputs in the posterior surface of the brain (named DNp1-35) [9].”

      (5) Figures:

      This is a formatting problem. The figure legends are separated from the figures, and there are no titles on the figures to indicate which one is which.

      We are sorry about this. We have added labels to the figures.

      Figure 1: What kind of geographic projections are these? The azimuth axis is not labeled.

      These stimuli were not perspective corrected, and therefore the RF maps simply reflect the visual monitor. We have clarified this in the figure legend, including mentioning that the axis label is the same for elevation and azimuth.

      Figure 2a: The error bars are not aligned to the angular axis.

      These have now been aligned.

      Supplement Figure 2b: I'm not sure why there are two measurements at each stimulus orientation. The bottom panel is confusing -- what do you mean by 'receptive field location'? And what does this red arrow/line mean in the bottom panel?

      Thank you for pointing this out. The figure was supposed to help the reader understand our transformations, so it’s great to know that it needed further explanation. To address this, we have added extra text and panel labels, please see lines 520 onwards.

      (6) Methods:

      Line 356: Maybe a picture or schematic drawing would be helpful to explain the setup. For instance, it's unclear what 32 degrees here refers to.

      This is a great suggestion, and a pictogram explaining the set-up can now be seen in Supplementary Fig. 6b.

      Line 404: What does it mean that 'spatially interpolate 10 times'?

      This sentence has been changed to “After subtracting the spontaneous rate, calculated for 0.8 s preceding stimulus onset (dotted line, inset, Fig. 1b, e), we interpolated the resulting local maximum responses to a ten-fold higher spatial resolution (colour coding, Fig. 1a, d).”
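
      The two steps named in the revised sentence (baseline subtraction, then ten-fold spatial interpolation) could be sketched as follows. This is only an illustration: the response does not state which interpolation scheme was used, so bilinear interpolation is an assumption here, and the function names `subtract_spontaneous` and `upsample_bilinear` are hypothetical.

```python
def subtract_spontaneous(responses, spontaneous_rate):
    """Subtract the spontaneous rate from the response at every grid position."""
    return [[r - spontaneous_rate for r in row] for row in responses]


def upsample_bilinear(grid, factor=10):
    """Bilinearly interpolate a 2D grid (at least 2 x 2) to `factor` times finer spacing.

    Assumption: bilinear interpolation; the original analysis may use another scheme.
    """
    rows, cols = len(grid), len(grid[0])
    out = []
    for i in range((rows - 1) * factor + 1):
        y0 = min(i // factor, rows - 2)      # index of the grid row above
        ty = i / factor - y0                 # fractional distance to the row below
        out_row = []
        for j in range((cols - 1) * factor + 1):
            x0 = min(j // factor, cols - 2)  # index of the grid column to the left
            tx = j / factor - x0             # fractional distance to the column on the right
            top = (1 - tx) * grid[y0][x0] + tx * grid[y0][x0 + 1]
            bottom = (1 - tx) * grid[y0 + 1][x0] + tx * grid[y0 + 1][x0 + 1]
            out_row.append((1 - ty) * top + ty * bottom)
        out.append(out_row)
    return out
```

      With this convention the original grid positions are preserved, and each gap between neighbouring positions is filled with nine interpolated values.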

      Line 405: How to determine the center from the 50% contour?

      We have modified the Methods to explain how this was done, please see lines 478 onwards.

      Line 408: Please explain more explicitly how LPD and LMS are computed.

      We have modified the Methods to explain how this was done, please see lines 488 onwards.

      Line 418: Is reference 42 correct? I could be wrong, but this reference seems to be talking about target-selective DNs rather than optic flow-sensitive DNs?

      Yes, this reference is correct. In a supp figure to ref 42, we show data from optic flow sensitive neurons, but not their receptive fields. Thanks for checking.

      Line 426: Are the full-screen stimuli presented in 8 directions too? Do I understand correctly that the preferred direction vector for the full-screen stimuli is extracted from a cosine fit, which is slightly different from the 'receptive field preferred direction' in the receptive field mapping measurement, which is the median of all the 'local preferred direction' (which are from the cosine fit)?

      We have modified the text to make this clear, please see lines 519 onwards, as well as the receptive field analysis, please see lines 474 onwards.
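
      For readers unfamiliar with the cosine fit the reviewer refers to, a minimal sketch is given below. It assumes responses measured at equally spaced stimulus directions covering the full circle (for example 8 directions, 45 degrees apart); in that case the least-squares fit of r(θ) = offset + amplitude · cos(θ − θ_pref) has a closed form given by the first Fourier harmonic. The function name `fit_cosine` is hypothetical, and the authors' actual fitting procedure may differ.

```python
import math


def fit_cosine(directions_deg, responses):
    """Closed-form cosine fit for equally spaced directions over 360 degrees.

    Returns (offset, amplitude, preferred_direction_deg).
    """
    n = len(responses)
    # Project the responses onto the cosine and sine of the stimulus direction.
    c = sum(r * math.cos(math.radians(d)) for d, r in zip(directions_deg, responses))
    s = sum(r * math.sin(math.radians(d)) for d, r in zip(directions_deg, responses))
    offset = sum(responses) / n
    amplitude = 2.0 * math.hypot(c, s) / n
    preferred = math.degrees(math.atan2(s, c)) % 360.0
    return offset, amplitude, preferred
```

      The preferred direction is the phase of the fitted cosine, and the amplitude gives a measure of directional tuning strength.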

    1. eLife Assessment

      This study provides valuable insight into the role of actin protrusions in mediating early pre-endocytic steps of human papillomavirus entry at the cell surface. Using state-of-the-art microscopy in an immortalized keratinocyte model, the authors present convincing evidence that filopodia actively promote the transfer of heparan sulfate-coated virions from the extracellular matrix to the viral entry factor CD151. These findings provide a strong framework for future studies aimed at further resolving the dynamics of virion transfer and receptor engagement.

    2. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The editors have determined that the authors adequately addressed the prior reviewer comments.]

      Summary:

      The authors' goal was to arrest PsV capsids on the extracellular matrix using cytochalasin D. The cohort was then released and interaction with the cell surface, specifically with CD151, was assessed.

      Note on previous revisions:

      The authors did an excellent job in their revision to include data from the effect of proteolytic priming on their observed virion transfer to the cell body. All other minor issues were addressed adequately.

      The work could be especially critical to understanding the process of in vivo infection.

    3. Reviewer #2 (Public review):

      Review of the previous version:

      The study design involves infecting HaCaT cells (immortalised keratinocytes mimicking basal cells of a target tissue) and observing virus localization with and without actin polymerization inhibition by cytochalasin D (cytoD) to analyze virion transfer from the ECM to the cell via filopodial structures, using cellular proteins as markers.

      In the context of the model system, the authors stress in the revised version the importance of using HaCaT cells as a relevant 'polarized' cell model for infection. The term 'polarized' is used in the cell biological literature for epithelial cells to describe a strict apical vs. basolateral demarcation of the plasma membrane with an established diffusion barrier of the tight junction. However, HaCaT cells do not form tight junctions. In squamous epithelia, such barriers are only found in granular layers of the epithelium. The published work cited in support of their claims either does not refer to polarity or only in the context of other cells such as CaCo-2 cells.

      Overall, the matter of polarity would be important, if indeed the virus could only access cell-associated HSPGs as primary binding receptor, or the elusive secondary receptor via the ECM in the used model system (HaCaT cells), if they would locate exclusively basolaterally. This is at least not the case for binding, as observed in several previous publications (just two examples: Becker et al, 2018, Smith et al., 2008). With only a rather weak attempt at experimental verification of their model system with regards to polarity of binding, the authors then go on to base their conclusions on this unverified assumption.

      This is one example of several in the manuscript, where claims for foundational premises, observations, and/or conclusions remain undocumented or not supported by experimental data.

      Another such example is the assumption of transfer of the virus from ECM to the tetraspanin CD151. Here, the conclusions are based on the poorly documented inability of the virus to bind to the cell body, which is in stark contrast to several previous publications, and raises questions. Thus, association with CD151 likely occurs both from ECM-derived virus AND virus that binds to cells, so that any conclusions on the mode of association are possible only with live-cell data (which are not provided). Overall, their proposed model thus remains largely unsubstantiated with regards to receptor switching.

      There are a number of important additional issues with the manuscript:

      First, none of the inhibitors has been tested in their system for efficacy and specificity; instead, the authors rely on published work in other cell types. This considerably weakens confidence in the conclusions drawn by the authors.

      Second, the authors aim to study transfer from ECM to the cell body and effects thereof. However, there are still substantial amounts of viruses that bind to the cell body compared to ECM-bound viruses in close vicinity to the cells. This is in part obscured by the small subcellular regions of interest that are imaged by STED microscopy, or by the use of plasma membrane sheets. This remains an issue despite the added Suppl. Fig. 1, where also only subcellular regions are being displayed. As a consequence, the data obtained from time point experiments are skewed, and remain for the most part unconvincing, largely because the origin of virions in time and space cannot be taken into account. This is particularly important when interpreting the association with HS, the tetraspanin CD151, and integrin alpha 6, as the low degree of association could be originating from cell-bound and ECM-transferred virions alike.

      Third, the use of fixed images in a time course series also does not allow one to assess a potential contribution of cell membrane retraction upon cytoD treatment due to destabilisation of cortical actin, or of cell spreading upon cytoD washout. The microscopic analysis uses an extension of a plasma membrane stain as a marker for ECM-bound virions; this may introduce a bias and skew the analysis.

      Fourth, while the use of randomisation during image analysis is highly recommended to establish significance (flipping), it should be done using only ROIs that have a similar density of objects for which correlations are being established. For instance, if one flips an image with half of the image showing the cell body, and half of the image ECM, it is clear that association with cell membrane structures will only be significant in the original. But given the high density of objects on the plasma membrane, I am not convinced that doing the same by flipping only the plasma membrane will not also obtain similar numbers to the original.
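
      The flipping control discussed in this point can be illustrated with a small sketch. It assumes that "flipping" means mirroring one channel of a two-channel ROI horizontally and recomputing the Pearson correlation coefficient (PCC); the exact procedure used in the manuscript is not described here, and the function names are hypothetical. The ROI is assumed to have been cropped beforehand to a region of roughly uniform object density, as recommended above.

```python
import math


def pearson(image_a, image_b):
    """Pearson correlation coefficient between two equally sized 2D images."""
    a = [v for row in image_a for v in row]
    b = [v for row in image_b for v in row]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def flip_horizontal(image):
    """Mirror a 2D image left to right."""
    return [list(reversed(row)) for row in image]


def pcc_with_flip_control(channel_a, channel_b):
    """Return (PCC of the original pair, PCC after mirroring channel_b)."""
    return pearson(channel_a, channel_b), pearson(channel_a, flip_horizontal(channel_b))
```

      If the flipped PCC stays close to the original PCC, the measured association may largely reflect object density within the ROI rather than specific co-localization, which is the concern raised in this point.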

    4. Author response:

      The following is the authors’ response to the previous reviews

      eLife Assessment

      This study provides valuable insight into the role of actin protrusions in mediating early pre-endocytic steps of human papillomavirus entry at the cell surface. Using state-of-the-art microscopy in an immortalized keratinocyte model, the authors present mostly solid evidence that filopodia actively promote the transfer of heparan sulfate-coated virions from the extracellular matrix to the viral entry factor CD151. Remaining gaps in the mechanistic model could be further supported by including a more expansive analysis of the fixed microscopy samples and live cell imaging to distinguish virion transfer from direct binding.

      We thank the editorial team for the improved eLife assessment. Regarding the remaining gap, we agree that it is not clear why the large majority of the virions indeed are transferred and not directly binding virions.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The authors' goal was to arrest PsV capsids on the extracellular matrix using cytochalasin D. The cohort was then released and interaction with the cell surface, specifically with CD151, was assessed.

      The model that fragmented HS associated with released virions mediates the dominant mechanism of infectious entry has only been suggested by research from a single laboratory and has not been verified in the 10+ years since publication. The authors are basing this study on the assumption that this model is correct, and these data are referred to repeatedly as the accepted model despite much evidence to the contrary. The discussion in lines 65-71 concerning virion and HSPG affinity changes is greatly simplified. The structural changes in the capsid induced by HS interaction and the role of this priming for KLK8 and furin cleavage has been well researched. Multiple laboratories have independently documented this. If this study aims to verify the shedding model, additional data needs to be provided.

      Comment of the authors: the above paragraph is copied from the very first review and describes the situation before revision.

      Note on revisions:

      The authors did an excellent job in their revision to include data from the effect of proteolytic priming on their observed virion transfer to the cell body. All other minor issues were addressed adequately.

      We are grateful that the referee acknowledges that we addressed all issues adequately.

      The work could be especially critical to understanding the process of in vivo infection. 

      We agree, and would like to point out that a similar comment was raised by the reviewing editor assigned to our original submission, John Schiller. For unknown reasons, he was no longer involved in the evaluation of the revision.

      Reviewer #2 (Public review):

      The study design involves infecting HaCaT cells (immortalised keratinocytes mimicking basal cells of a target tissue) and observing virus localization with and without actin polymerization inhibition by cytochalasin D (cytoD) to analyze virion transfer from the ECM to the cell via filopodial structures, using cellular proteins as markers.

      In the context of the model system, the authors stress in the revised version the importance of using HaCaT cells as a relevant 'polarized' cell model for infection. The term 'polarized' is used in the cell biological literature for epithelial cells to describe a strict apical vs. basolateral demarcation of the plasma membrane with an established diffusion barrier of the tight junction. However, HaCaT cells do not form tight junctions. In squamous epithelia, such barriers are only found in granular layers of the epithelium. The published work cited in support of their claims either does not refer to polarity or only in the context of other cells such as CaCo-2 cells.

      We thank the reviewer for this important clarification and fully agree. HaCaT cells do not form tight junctions and therefore do not fulfill the classical definition of polarized epithelial cells with a strict apical-basolateral diffusion barrier. In response to this comment, we have removed the term “polarized” in reference to HaCaT cells throughout the revised manuscript. Our intention was not to imply classical epithelial polarity, but rather to emphasize that HaCaT cells represent a functionally relevant keratinocyte model that recapitulates key early steps of HPV infection observed in vivo, particularly abundant ECM deposition enabling strong virion binding to the ECM.

      We now state on line 120: “PsVs that bind to the ECM at sites distal from the cell body are unable to establish direct contact with entry receptors, until the cell migrates onto them or they are transported along cell protrusions towards the cell body (Schelhaas et al., 2008; Smith et al., 2008). Both cell migration and protrusion transport depend on actin dynamics (Schaks et al., 2019). We aimed for blocking these active recruitment mechanisms in HaCaT cells, a cell line that is widely used as a cell culture model for HPV infection. They resemble primary keratinocytes in several key aspects: they are not virally transformed and produce large amounts of ECM, promoting interactions between viruses and ECM components and thereby facilitating infection (Bienkowska-Haba et al., 2018; Gilson et al., 2020). In addition, subconfluent HaCaT cells form filopodia and filopodial transport is used for the recruitment of ECM-bound virus particles to the cell body (Schelhaas et al., 2008, Smith et al., 2008). Together, these features make HaCaT cells a suitable model for studying active PsV recruitment from the ECM to the cell surface.”

      Overall, the matter of polarity would be important, if indeed the virus could only access cell-associated HSPGs as primary binding receptor, or the elusive secondary receptor via the ECM in the used model system (HaCaT cells), if they would locate exclusively basolaterally.

      We apologize for not having stressed enough that virions also bind directly to the upper cell membrane, which is not imaged. To make clear that HaCaT cells are still a suitable model for studying active recruitment, throughout the manuscript, we worked on the following issues (this is an outline, for details see below):

      (1) We now discuss adequately that virions reach cell surface receptors either by passive diffusion or by active transport mechanisms, the latter involving actin dynamics (filopodial transport and cell migration), to which we refer in the revised manuscript as active recruitment.

      (2) We explain why the large majority of virions in the microscopic assay are actively recruited virions.

      (3) We explain the difference between biochemical infection assays, which do not differentiate between passive and active recruitment, and microscopic assays, which study the basal cell membrane and thereby primarily actively recruited virions.

      This is at least not the case for binding, as observed in several previous publications (just two examples: Becker et al, 2018, Smith et al., 2008). With only a rather weak attempt at experimental verification of their model system with regards to polarity of binding, the authors then go on to base their conclusions on this unverified assumption.

      We agree with the reviewer that strict epithelial polarity would only be relevant if HPV binding or receptor accessibility were confined to the basolateral membrane, which is not the case in HaCaT cells, as shown previously (e.g., Becker et al., 2018; Smith et al., 2008). However, our conclusions do not rely on strictly polarity-dependent binding.

      We added the following paragraphs clarifying that (i) in HaCaT cells PsVs also bind by passive diffusion to the upper cell membrane and that (ii) at the basal membrane the large majority of imaged PsVs is actively recruited.

      Line 332: “…, the lower PCC at 0 min/CytD suggests that without active recruitment less PsVs reach CD151. At 30 min after CytD, the PCC has reached the level of 0.1 as in the control, which is in line with the idea of fast recruitment as observed in Figure 4. To follow how the basal cell membrane is populated with PsVs over time, as additional analysis we determined the PsVs per µm² in ROIs placed in the cell body region. At 0 min, CytD reduces the PsV density to 19 - 33%, albeit the effect is not significant, and at 180 min/CytD the same PsV density as in the control is reached (Supplementary Figure 6A and B). Overall, under CytD there was a trend towards less PsVs present (Supplementary Figure 6A and B). Hence, both Figure 5C and Supplementary Figure 6A and B suggest that active virion transport is required to reach efficiently the basal membrane.”

      Line 447: “Throughout all experiments, we observe at 0 min/CytD only few PsVs at the basal membrane (Figure 1A, Supplementary Figure 6A and B; see also PCC at 0 min between PsVs and CD151 in Figure 5C), suggesting that in the absence of active recruitment the access to the basal membrane via passive diffusion is limited. We wondered, how many PsVs may bind to the cell membrane without a diffusion barrier? For this reason, we incubated EDTA detached HaCaT cells in suspension with PsVs for 1 h at 4 °C, followed by re-attachment for 1 h. Under these conditions, we find, despite a shorter incubation time (1 h versus 5 h), a roughly 3-fold larger PsV density (1.7 PsVs/µm² (Supplementary Figure 6D)) than the highest density observed in the other experiments. However, it should be noted that values of the different experiments cannot be directly compared. Aside from the different treatments, another difference lies in the size of the imaged membrane. The re-attachment of cells is not complete after 1 h (compare size of adhered membranes in Supplementary Figure 6A and 1A), wherefore the membranes are likely strongly ruffled, which results in the underestimation of the membrane area. As a result, we overestimate the PsVs per µm² adhered membrane (please note that we cannot re-attach cells for longer times as we then lose PsVs due to endocytosis). In any case, the experiment suggests that PsVs bind more efficiently to membrane surface receptors without a diffusion barrier. We conclude that in our assay PsVs cannot readily bypass the active PsV recruitment by diffusing directly to the basal cell membrane, which is plausible, because to make this happen a 55 nm large PsV must diffuse through the narrow gap between glass-coverslip and adhered cell.”

      Line 538: “The analyzed PsVs hardly bind to the basal cell surface directly by diffusion (Supplementary Figure 6, compare PsV maxima density at 0 min/CytD in A and B to C). Therefore, the actin-driven virion transport would play a decisive role in HPV infection if cells would form a monolayer with a disruption at which ECM is present and that is approached by PsVs, a scenario similar to in vivo infection. In addition, cell migration could establish contact between PsVs and the cell surface.”

      Line 548: “…that can readily bind to the upper cell membrane. We are not aware of a PsV translocation mechanism from the upper to the basal membrane. Therefore, in our assay, PsVs bound to the upper membrane are not expected to show up at the basal membrane. Comparing 0 min of control and CytD (Supplementary Figure 6A and B), we find that compared to the control 19 - 33% of the PsVs reach the basal membrane in the absence of active transport, or in other words, most likely by passive diffusion. Actually, the range from 19 – 33% must be a strong overestimate as PsVs in the control are in transit and many actively recruited PsVs are already internalized during the 5 h incubation period. For this reason, we propose that most likely much less than 10% of the PsVs reach the basal membrane by diffusion. Moreover, in the absence of the diffusion barrier, the density of bound PsVs is strongly increased (Supplementary Figure 6D), showing indirectly that at the basal membrane the binding sites are difficult to access without active recruitment. Taken together, we propose the large majority of PsVs analyzed in our assay are ECM bound and actively recruited to the basal cell membrane.”

      This is one example of several in the manuscript, where claims for foundational premises, observations, and/or conclusions remain undocumented or not supported by experimental data.

      Another such example is the assumption of transfer of the virus from ECM to the tetraspanin CD151. Here, the conclusions are based on the poorly documented inability of the virus to bind to the cell body, which is in stark contrast to several previous publications, and raises questions.

      We hope with the above changes we made clear that virions can also directly bind to the cell body. We also added a paragraph discussing differences between biochemical and microscopic assays.

      Line 568: “In this scenario, sub-confluent HaCaT cells, or even better single HaCaT cells, would be an ideal model system for the microscopic study of these very early infection steps that involve ECM attachment and subsequent active recruitment, as supposed to occur during in vivo infection of basal keratinocytes after binding of virions to the basement membrane (Bienkowska-Haba et al., 2018; Day and Schelhaas, 2014; Kines et al., 2009; Schiller et al., 2010). In contrast, in biochemical infection assays, virions diffusing to HSPGs on the cell surface, and by this bypassing active recruitment, are assayed together with the actively recruited virions. Should cells secrete little ECM and are grown to confluency, the passively binding virions are supposed to strongly dominate the infection rate in a biochemical infection assay.”

      There are a number of important additional issues with the manuscript:

      First, none of the inhibitors have been tested in their system for efficacy and specificity, but rely on published work in other cell types. This considerably weakens the confidence on the conclusion drawn by the authors.

      We use inhibitors CytD, blebbistatin, leupeptin and furin inhibitor I. The below references are examples reporting the usage of the inhibitors on HaCaT cells studied in the context of HPV infection.

      Furin inhibitor I:

      Cruz et al., Cleavage of the HPV16 Minor Capsid Protein L2 during Virion Morphogenesis Ablates the Requirement for Cellular Furin during De Novo Infection. Viruses, 2015; doi.org/10.3390/v7112910

      Cytochalasin D/Blebbistatin:

      Schelhaas et al., Human papillomavirus type 16 entry: retrograde cell surface transport along actin-rich protrusions. PLoS Pathog., 2008; doi.org/10.1371/journal.ppat.1000148

      Smith et al., Virus activated filopodia promote human papillomavirus type 31 uptake from the extracellular matrix. Virology, 2009; doi.org/10.1016/j.virol.2008.08.040

      Leupeptin/Furin inhibitor I:

      Cerqueira et al., Kallikrein-8 Proteolytically Processes Human Papillomaviruses in the Extracellular Space To Facilitate Entry into Host Cells. J. Virology, 2015; doi.org/10.1128/jvi.00234-15

      Moreover, the reversible inhibitory effect of CytD, the key inhibitor used in this study, on transport and infection is validated in this study. However, we now discuss these data more critically in the context of directly binding virions.

      Line 485: “Hence, the infection assay suggests that the treatment is largely reversible and only slightly harmful, if at all. However, the luciferase infection assay does not distinguish between actively recruited PsVs and PsVs that bind passively by diffusion to the upper membrane. The latter fraction likely dominates the total infection rate and should be less affected by CytD than the fraction of actively recruited PsVs. Therefore, if the infection pathway of a small fraction of actively recruited PsVs is irreversibly inhibited, we may not be able to detect this effect on the background of unaffected passively binding PsV.”

      Second, the authors aim to study transfer from ECM to the cell body and effects thereof. However, there are still substantial amounts of viruses that bind to the cell body compared to ECM-bound viruses in close vicinity to the cells.

      Regarding direct binding to the cell body, please see our detailed reply above.

      This is in part obscured by the small subcellular regions of interest that are imaged by STED microscopy, or by the use of plasma membrane sheets. This remains an issue despite the added Suppl. Fig. 1, where also only subcellular regions are being displayed. As a consequence, the data obtained from the time point experiments are skewed and remain for the most part unconvincing, largely because the origin of virions in time and space cannot be taken into account. This is particularly important when interpreting the association with HS, the tetraspanin CD151, and integrin alpha 6, as the low degree of association could be originating from cell-bound and ECM-transferred virions alike.

      We hope with the above explanations it is plausible that the imaged virions primarily reach the basal membrane by active recruitment.

      Third, the use of fixed images in a time course series also does not allow to understand the issue of a potential contribution of cell membrane retraction upon cytoD treatment due to destabilisation of cortical actin. Or, of cell spreading upon cytoD washout. The microscopic analysis uses an extension of a plasma membrane stain as marker for ECM bound virions, this may introduce a bias and skew the analysis.

      The referee is correct in pointing out that cell spreading after CytD wash off would affect our analysis, e.g. by increasing the overlap between PsVs and the cell body although no active recruitment via filopodial transport and cell migration occurs. An argument speaking against this possibility is the lack of increase in the PCC between PsVs and F-actin after CytD removal, if the protease inhibitor leupeptin was present (Figure 2B and D). Leupeptin prevents PsV/phalloidin overlap despite restored actin polymerization after washout of both inhibitors, suggesting that priming is required for increased PsV–actin association and is too slow to change PCC within 60 min. These results support that the observed overlap reflects active, priming-dependent recruitment rather than cell morphology changes.

      We state on line 252: “Moreover, the experiment suggests that without PsV priming the PCC between PsV-L1 and F-actin does not increase, for instance, due to cell spreading after CytD removal.”

      On line 494, we state “However, we assume that this is rather unlikely, as cell spreading would increase the PCC between PsVs and F-actin under a condition where PsVs are not-primed (and therefore not actively recruited) but cell spreading occurs, which is not the case in Figure 2B and D (CytD/leupeptin).”

      Fourth, while the use of randomisation during image analysis is highly recommended to establish significance (flipping), it should be done using only ROIs that have a similar density of objects for which correlations are being established. For instance, if one flips an image with half of the image showing the cell body, and half of the image ECM, it is clear that association with cell membrane structures will only be significant in the original. But given the high density of objects on the plasma membrane, I am not convinced that doing the same by flipping only the plasma membrane will not also obtain similar numbers than the original.

      Regarding the association of PsVs with CD151 and HS, we corrected for random background with reference to a calibration line that describes the random background association in dependence of the density of objects. We now refer to this issue on line 343: “…, the fraction of PsVs closely associated with CD151 is around 10% (Figure 5D, control), after correction for random background association, for which we used a calibration line based on the same density of PsVs in flipped images (see Supplementary Figure 7).”

      In the legend of Supplementary Figure 7 we state: “…The fraction of closely associated PsVs (PsV-L1 maxima with a distance ≤ 80 nm to the next nearest CD151 maximum) in the Control of Figure 5 was analyzed on original and flipped images (for an example of a flipped image see Supplementary Figure 5A)…on flipped images, we often find values more than half of the values of the original images, demonstrating that many PsVs have a distance ≤ 80 nm to CD151 merely by chance, in the following referred to as background association…We take the altogether 24 fraction values obtained on flipped images (12 values from Control and CytD each), and plot the fraction of closely associated PsVs against the average CD151 maxima density in the respective images. As can be seen in (C), the fraction increases with the maxima density, as the chance of a distance ≤ 80 nm increases with the maxima density. The fitted linear regression line describes how the background association depends on the maxima density. As a result, the background association (y) can be calculated for any maxima density (x) with the equation y = 2.04 • x. The CytD/0 min condition may be overcorrected, if it includes many images with CD151 flipped onto peripheral PsVs that actually are distal to CD151 (for an example ROI see Supplementary Figure 5A). On the other hand, PsVs right at the cell border, where CD151 staining tends to be strong (Supplementary Figure 5A), after flipping have less CD151 than before, contributing to undercorrection.”

      When omitting the CytD/0 min values, we obtain essentially the same calibration line.
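      The background correction described in this reply can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' analysis code: the function names, the horizontal flip axis, the coordinate units (nm) and the through-origin fit are assumptions (the latter inferred from the stated equation y = 2.04 • x, which has no intercept). Only the 80 nm distance criterion and the idea of a density-dependent calibration line come from the quoted legend.

```python
import math

def close_fraction(psv_xy, cd151_xy, max_dist_nm=80.0):
    """Fraction of PsV maxima whose nearest CD151 maximum lies within max_dist_nm."""
    if not psv_xy or not cd151_xy:
        return 0.0
    close = 0
    for px, py in psv_xy:
        nearest = min(math.hypot(px - cx, py - cy) for cx, cy in cd151_xy)
        if nearest <= max_dist_nm:
            close += 1
    return close / len(psv_xy)

def flip_horizontally(points_xy, width_nm):
    """Mirror one channel along x (assumed flip axis) to randomise the association."""
    return [(width_nm - x, y) for x, y in points_xy]

def fit_slope_through_origin(densities, fractions):
    """Least-squares slope of y = slope * x (no intercept, an assumption here)."""
    numerator = sum(x * y for x, y in zip(densities, fractions))
    denominator = sum(x * x for x in densities)
    return numerator / denominator

def corrected_fraction(observed, density, slope):
    """Subtract the density-dependent background association from an observed value."""
    return observed - slope * density
```

      In use, `close_fraction` would be evaluated on each flipped image, the resulting values fitted against the CD151 maxima density with `fit_slope_through_origin`, and the slope then passed to `corrected_fraction` for the values measured on the original images. The fraction and density must be in the same units as those used for the calibration.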

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      There are further issues that are not pertaining to the study design that I find important.

      Fig.1

      There are few, if any, filopodia in untreated cells. It would be good to quantify their abundance to substantiate that resting HaCaT cells are indeed a good model for filopodial transport vs. membrane retraction / spreading.

      We see filopodia in untreated HaCaT cells (although quite variable in abundance, please see control cells in e.g. Figure 3 and 8 and Supplementary Figure 2).

      In HaCaT ECM, a good part of the virus also binds to laminin-332. Would this not also confound the analysis?

      We agree with the reviewer that in HaCaT-derived ECM, virus binding is not restricted to heparan sulfate (HS), and that laminin-332 represents an additional relevant binding partner. Indeed, viruses bound to laminin-332 may likewise be transported toward the cell body via laminin-binding integrins. We therefore consider laminin-332 to act as a parallel attachment factor alongside HS rather than as a mutually exclusive alternative.

      However, the primary aim of this study was not to comprehensively map all ECM binding partners, but to analyze the actin-dependent transport of ECM-bound virus particles. HS was chosen as a representative and well-characterized ECM marker for initial virus attachment. Importantly, inhibition of actin dynamics by cytochalasin D blocks this transport process downstream of initial binding. Thus, irrespective of whether the virus is initially bound to HS, laminin-332, or both, the readout reflects interference with the same actin-dependent transport mechanism.

      Consequently, the presence of laminin-332 binding does not confound our analysis, as the experimental outcome is determined by inhibition of transport rather than by the specific ECM attachment factor. Nonetheless, we acknowledge laminin-332 as an important parallel interaction partner. We had already mentioned it in the first version of the manuscript but removed the sentence during the last revision; it has now been added again. On line 593 we state: “Finally, not all PsVs bound to the ECM are expected to bind to HS but could also bind to laminin 332 (Culp et al., 2006).”

      Fig.2

      Would benefit from live cell analysis. There are considerable amounts of virions on the cell body, which partially contradicts statements from Fig. 1. The fast transfer to the cell body after cyto D washout is based on the assumption that filopodia formation and transport along them (and not membrane extension) occurs quickly. Is this reasonable? Does membrane extension and migration occur between 0 min and later time points?

      Regarding membrane extension after CytD removal, that in the analysis may be indistinguishable from active recruitment transfer, please see our reply above (no PCC increase between PsV-L1 and F-actin after CytD removal if leupeptin is employed). Regarding migration, we now included this possibility as an active recruitment mechanism that may occur in parallel to filopodial transport (please see our reply above).

      Fig.4

      How are the subcellular ROIs chosen? Is there not a bias by not studying a full cell?

      In Figure 4 we are specifically interested in the time course of PsV diminishment from the cell periphery. The ROIs are generated with reference to the membrane staining, using the cell body delineation as a starting point. For details about how ROIs are generated, please see legend of Figure 4 and materials and methods.

      Fig. 5/6

      The data needs a better analysis on correlation by using randomisation as explained above.

      Please see our reply above. The association between PsVs and CD151 or HS has been corrected using a calibration line based on the same density of objects.

      Fig. 8. Why does blebbistatin block the transport only partially? Previous work on actin retrograde flow suggests that in the absence of myosin II function the transport stops completely. Would this not be a concern when interpreting the CytD data?

      Is the referee referring to Schelhaas et al., 2008 that we cite in the paper? In this paper, in HeLa cells blebbistatin reduced the directed particle motion by 82%, but not completely.

      Suppl. Fig. 1A, B: Intended to address the issue of viruses binding to the cell body, it unfortunately falls short. It would have been better to analyse complete cells rather than ROIs, or better even, a comprehensive analysis of cell islets (boundary cells vs. central cells, with cell body to cell periphery).

      This experiment addresses the increase in PsV density resulting from active recruitment. Outlining entire cells would include also PsVs close to the cell edge that have not been actively recruited.

      Regarding cell islets (we call them patches of confluent cells as islets may be confused with e.g. more structured Langerhans islets), there are hardly any PsVs at the basal membrane. We state on line 135: “Frequently, we observe patches of confluent cells which are common to HaCaT cells. Cells at the center of these patches are dismissed during imaging, because hardly any PsVs are bound to their basal membrane, indicating that PsVs do rather not reach this area by passive diffusion. Instead, we focus on isolated HaCaT cells or cells at the periphery of cell patches. At these cells, we find more PsVs per cell than one would expect from the employed ≈ 50 viral genome equivalents (vge) per cell, indicating that PsVs are unequally distributed between the cells.”

      Is the difference between untreated and cytoD treated significant?

      We stated in the Figure legend that the difference is not significant (the exact p value is p = 0.089). We now have revised the Figure (previously Supplementary Figure 1A and B, now Supplementary Figure 6A and B), showing the PsV density at the basal membrane over time, also for the experiment shown in Figure 6. The now revised Figure (Supplementary Figure 6A and B) is discussed together with the re-attachment experiment (Supplementary Figure 6C and D), in order to compare the PsV accessibility to the cell membrane with and without diffusion barrier. Please see our reply above (paragraph starting at line 447).

    1. Author response:

      We are particularly encouraged by the consensus that our study provides a substantial resource and that the bioinformatic framework is biologically grounded and convincing, while appropriately noting that further experimental validation will be required. We fully agree with this point. As clarified in the revised manuscript, the lineage relationships we describe are inferred from integrative transcriptomic analyses and are intended to provide a mechanistic and conceptual framework rather than definitive proof of cellular origin. We have further strengthened the Discussion to explicitly acknowledge these limitations and outline future directions, including lineage tracing and functional validation studies.

      At the same time, we respectfully note that such experimental validation would require a substantial extension of this work and likely 2–3 years of additional studies, including development of appropriate model systems. We believe these efforts represent an important next phase of investigation rather than a revision-level addition to the current manuscript. Our primary goal here is to present a high-resolution human transcriptomic resource and a coherent framework that identifies biologically plausible epithelial intermediates linking normal fallopian tube hierarchy to malignant states.

      Given the reviewers’ positive evaluation and recognition of the value and rigor of the dataset and analyses, we respectfully request consideration to proceed with publication as an eLife Version of Record without further experimental revision. We believe that the timely dissemination of these findings will provide a useful resource for the field and help guide the experimental studies needed to test the hypotheses generated here.

    2. Reviewer #2 (Public review):

      Summary:

      The authors used single-nuclei sequencing of benign fallopian tubes and ovarian cancer to delineate the plausible cell of origin of high-grade serous ovarian cancer.

      Strengths:

      These substantial data provide the field with significant research resources to examine additional features in normal fallopian tubes and ovarian cancers. The highly detailed bioinformatic analysis, rooted in a strong biological framework, is convincing. The methodology was appropriate and used validated methodology based on biological relevance (region selection and transcriptomics analysis).

      The authors propose a convincing model of epithelial progenitor cells and their localisation in high-grade serous ovarian cancers. These findings are important and useful.

      Weaknesses:

      Overall, the weaknesses are clearly stated in the discussion. The study provides a novel framework for future study, and proposes a model which will require validation.

      Within the ovarian cancer field, the endometrioid and clear cell histotypes are thought to arise from ciliated or secretory cells. Typically these are thought to be from the cervix or uterus. This concept was not mentioned in the work.

      Further, in the ovarian cancer field, stemness is judged by some classic assays - aldehyde assays looking at ALDH1A1 and spheroid-producing ability. These were not mentioned - could these be useful in a population of fallopian tube epithelial cells, or would other assays/markers be more appropriate?

      The choice of ES2 and OVCAR was not sufficiently justified, as ES2 is widely regarded as a clear cell ovarian cancer cell line in many research circles. Additionally, I did not see confirmation of gene knockdown by Western blot or qPCR.

      PGR loss through copy number variant was surprising, as this was a marker. So would the marker be lost through one of these mechanisms randomly or specifically?

    3. Reviewer #1 (Public review):

      Summary:

      Using comprehensive profiling of normal and cancerous tissue via bulk and single-cell RNA sequencing, the authors identified that high-grade serous ovarian cancer is likely to originate from the epithelial progenitor cells from the distal fimbrial region of the fallopian tube, where it has been previously shown to be most prone to ovulatory stress and other microenvironmental influences. The authors also included a CNV analysis to identify hotspots in HGSOCs.

      The findings are preliminary, but the resource on its own has great potential and can be used for developing methods for early detection, stratification and treatment.

      The main limitation of this study is that the lineage is purely inferred from bioinformatics analysis. More validation work is required, perhaps using cell models / other model organisms.

      Strengths and weaknesses:

      The authors investigated the origin of high-grade serous ovarian cancer, which is one of the deadliest. They performed comparative analysis using both bulk and single-nucleus RNA sequencing between cancerous and normal tissues (fallopian tube and ovaries) and identified a population of epithelial progenitor cells from the distal fimbrial region that are exposed to ovulatory stress, as the most plausible cells of origin. The extensive profiling of the molecular signatures can also be used for early detection and stratification for treating the disease.

      Previous studies have shown that HGSOCs likely originated from the epithelial lining of the fallopian tubes (PMID 32349388). The bulk RNAseq data is confusing in that neither the overall correlation of the transcriptome nor the sample clustering (Figure 1) supports the idea that the HGSOCs are close to the fallopian tube. The authors could perform a more comprehensive marker gene-based analysis to demonstrate their relationship.

      The authors also performed a comprehensive analysis of single-cell datasets on both normal and cancerous tissue in humans. From there, they performed a combination of RNA velocity, PAGA and pseudotime, etc, to try and delineate the relationship amongst related cell populations. It would be helpful if the authors could clarify why they applied this particular suite of tools (explaining the differences between tools and bioinformatic approaches) to assist the broader readership who may not be familiar with this type of single-cell bioinformatic analysis.

      It also seems to me that the authors did not account for patient effect when they performed the data integration (this point is discussed in the text). This may explain at least partially why the clusters are segregated by patient samples. Another explanation is that it could be due to uneven sampling, as only very few cells (1000s) were captured from each of the tumour samples, and this is clear when a dramatic difference can be seen in their cellular composition.

      The trajectory analysis of normal and cancer single-cell data should also include other cells to prevent confirmation bias, as these analyses would only consider relationships amongst the cells available in the model.

      As the authors indicated in the limitations, the cell lineage in the studies is largely inferred from the bioinformatics analysis. Experimental lineage tracing via other experimental models (organoids/animal models) would be required.

      Despite these limitations, this study will serve as an important resource for the scientific community. I would also suggest that the authors should share this resource via additional portals in addition to the GEO data deposit (e.g. the HCA, or single-cell portals such as at the Broad Institute or CellXGene Discover).

    4. eLife Assessment

      This valuable study reports a substantial single-cell RNAseq and bulk RNAseq dataset from multiple high-grade serous ovarian cancers, including a single-cell atlas of human fallopian tube epithelium. The bioinformatic analysis investigating the lineage and location of epithelial progenitor cells is convincing, although this will require experimental validation. The work also provides a resource to examine additional features of normal fallopian tubes and ovarian cancers, and for developing methods for early detection and tumour stratification.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Weaknesses:

      In my view, the presentation of the data is in some cases not ideal. The phrasing of some conclusions (e.g., group-attacks and wolf-pack-hunting by the bacteria) is in my opinion too strong based on the herein provided data.

      We agree with your comment and have replaced the terms “Group-attacks” and “wolf-pack-hunting” with “attacks” throughout the manuscript.

      Reviewer #1 (Recommendations for the authors):

      (1) Figure 2AB, please add the name of the statistical test and the number of replicates that the data is based on to the figure legend.

      We thank Reviewer #1 for highlighting the need for more detail. We have revised the manuscript accordingly. The captions of Figures 2, 3, 4 and S1 were revised to include the name of the statistical test and the number of replicates. Asterisks indicate significant differences in a multiple comparison test (one-way ANOVA with post hoc Tukey test): * P ≤ 0.05, ** P ≤ 0.01, *** P ≤ 0.001.

      (2) Figure 2C is this figure referred to in the text?

      We apologize for this oversight. Figure 2C was replaced by new figures 2C and 2D and the old figure 2C is now referenced in the manuscript as Fig 3B1.

      (3) Movie 1, could the movie please also be provided as .mp4? I suggest including individual images across time in the main figure so that readers do not rely on opening a supplementary file for this key finding of the study.

      In the revised manuscript, all the videos were converted to mp4 format and individual images across time were included in Figure 2C and 2D (Chronological snapshots of one attack) and in figure 3B1 (Chronological snapshots of the complete event), thereby improving the readability of the manuscript.

      (4) Figure 3A2 (text l. 355), I am afraid I do not find this figure.

      Fig. 3A2, which previously corresponded to Fig. 3B1, now corresponds to Fig. 2C and Fig. 2D. This has been corrected in the revised version of the manuscript.

      (5) Lines 356ff, I am afraid that I find it hard to follow what the authors refer to as the right cell or the left cell. I suggest either adding labels to the movies or providing individual images across multiple timepoints into the main figure that can be labelled and bring across the point.

      Arrows have been added to videos 3–5 to clearly indicate the cells referred to in the text and facilitate tracking across time.

      (6) In general, for all the microscopy, on how many cells have these phenomena been observed? What is n=x? Has this been quantified?

      We thank the reviewer for pointing this out.

      In the caption of Fig. 3, the sentence “(A) Percentage of motile A. pacificum ACT03. (B) A. pacificum ACT03 attacked by V. atlanticus LGP32 and (C) A. pacificum ACT03 lysis after 0, 15, 30, 45 and 60 min of interaction.” was replaced by “(A) Cumulative percentage of motile A. pacificum ACT03 cells. (B) Cumulative number of cells attacked by V. atlanticus LGP32 and (C) Cumulative cell lysis after 0, 15, 30, 45 and 60 minutes of interaction.” In the Fig. 3 caption, the sentence “All percentages were determined based on a minimum of 2,000 cells of A. pacificum ACT03.” was also added.

      In Fig. 4 caption, the sentence “All percentages were determined based on a minimum of 2,000 cells of A. pacificum ACT03.” was added.

      In Fig. S1 caption, the sentence “All percentages were determined based on a minimum of 2,000 cells of A. pacificum ACT03.” was added.

      (7) Figure S1A, does this figure show means plus/minus standard deviation? If yes, please add this to the figure legends.

      In Fig. S1 caption, the sentence “Error bars represent the standard deviation of the mean of three independent experiments” was added.

      How do the authors explain the big variation in the test condition and not in the control?

      The higher variation observed in the test condition compared to the control may reflect, on the one hand, biological variability between independent batches of the 60-h V. atlanticus cultures used to prepare the supernatants and, on the other hand, heterogeneity in the physiological status of independent algal batches (N = 3; 2 × 10^4 cells; see Materials and Methods, Co-culture assay), which may not be perfectly synchronized. In contrast, the control condition consists of A. pacificum cultures incubated in fresh medium without bacterial supernatant, for which algal motility is highly reproducible and thus shows very little variation.

      (8) Line 375, "The lysis phase corresponded to initial vesicle formation followed by the bursting of A. pacificum ACT03 cells (Movie 5) and was induced by the old-starved culture supernatant of V. atlanticus LGP32 (Fig. S1)." Is this reference to Figure S1 correct? S1 shows motility, doesn't it? I don't see how this data supports the statement made in this sentence.

      We apologize for this unclear message.

      “The lysis phase corresponded to initial vesicle formation followed by the bursting of A. pacificum ACT03 cells (Video 5) and was induced by the old-starved culture supernatant of V. atlanticus LGP32 (Fig. S1).” was replaced by “The lysis phase corresponded to initial vesicle formation followed by the bursting of A. pacificum ACT03 cells (Fig. 3C and 3C1).”

      In addition, “We next tested whether this lytic effect was mediated by thermostable molecule(s) secreted by Vibrio.” was replaced by “We next tested whether this lytic effect was linked to Vibrio culture supernatant and mediated by thermostable molecule(s) secreted by Vibrio.”

      (9) Line 388ff, "Group attacks were observed on non-degraded A. pacificum ACT03 cells, but not on previously lysed cells." No reference to a figure is provided. I am afraid I don't see the data that this statement is based on.

      As it is impossible to show a lack of attack, we just clarified the basis of our experiment.

      “To this end, A. pacificum ACT03 in exponential growth phase was first exposed for 30 minutes to the supernatant of a 60-hour culture of V. atlanticus LGP32, which induced 25% lysis of A. pacificum ACT03 cells. Next, the corresponding V. atlanticus LGP32 cells were added. During exposure, attacks were observed only on undegraded A. pacificum ACT03 cells, but not on previously lysed cells” was replaced by “To this end, A. pacificum ACT03 in exponential growth phase was first exposed for 30 minutes to the supernatant of a 126-hour culture of V. atlanticus LGP32, which induced lysis of 70% of the A. pacificum ACT03 cells (Figures 3C and 3C1, arrow 2 and video 4). Next, cells of V. atlanticus LGP32 from a 60-hour culture, capable of attacking A. pacificum ACT03 cells (Fig. 3B), were added. For 1 hour of exposure, no attack was observed on the previously lysed algae.”

      (10) Figure 4a, Based on the labeling of the figure, in particular the x-axis, it is not fully clear to me what I am looking at.

      Figure 4A has been reworked and its legend modified. We hope that this graph is clearer now.

      (11) Line 428, did the authors consider complementing the pvuD deletion mutant and testing for gain of function when providing the gene in trans?

      We did not investigate pvuD in this study and did not construct a pvuD deletion mutant. We therefore assume that the recommendation refers to pvuB, which was the focus of our work. Unfortunately, we did not perform this experiment. However, several lines of evidence support the implication of PvuB and the vibrioferrin uptake system in this process: (i) the loss of attack behaviour is specific to the mutant in the vibrioferrin uptake pathway and (ii) our expression and proteomic data show a strong induction of vibrioferrin uptake components under starvation and iron-manipulated conditions, which correlate with the attack phenotype.

      (12) Use of the term "group attack" in parentheses in the text, but in the section header and title. Is there really sufficient actual data to say that this is a "group attack"? What exactly are the indications for this being a behaviour of a group?

      We agree with you. The terms “group attacks” and “wolf-pack hunting” were replaced by the more neutral term “attacks” throughout the manuscript.

      (13) Table S1 and S2, those tables give a nice overview. Do the authors provide the raw data based on which they make a claim on "+" and "-" in the individual categories? I would prefer to see the actual data or at least have the possibility to look into this.

      In the revised versions of Tables 1 and 2, we have improved the captions and clarified the meaning of each column in order to avoid any ambiguity between the results of this study and the bibliographic information.

      Specifically regarding Table 2:

      We do not present any visuals of the interaction between Vibrio and Alexandrium because these species all look alike. Regarding the other algae species tested in interaction with Vibrio, phenomena other than lysis or cell attack have been observed and are the subject of specific laboratory studies.

      (14) Line 456 "first study", line 40f "first evidence of a new mechanism". I suggest toning this down a bit and being clearer in the abstract about this being a working model that can be suggested based on individual bits of data.

      We thank Reviewer #1 for this helpful suggestion.

      In the summary:

      “This is the first evidence of a new mechanism that could to be involved in regulating Alexandrium spp. blooms and giving Vibrio a competitive advantage in obtaining nutrients from the environment.” was replaced by “The interaction model we propose here suggests that Vibrio could play a role in regulating the proliferation of Alexandrium spp., giving it a competitive advantage in obtaining nutrients from the environment.”

      In the discussion:

      “Considering predator as a free organism that feeds at the expense of another, this study is the first evidence of the capacity of some Vibrio to develop a predatory strategy against an alga. This behaviour differs from parasitism, because the survival of Vibrio is not exclusively dependent on algae in environment” was replaced by “Consider a predator as a free-living organism that kills its prey and feeds on it, this study provides data suggesting the ability of Vibrios to develop an original predator-like behaviour to kill and feed on algae.”

      (15) Line 469 "Overall, these observations show that V. atlanticus LGP32 is able of wolf-pack hunting behaviour." I see the similarities. I feel that the term "show" is a bit too strong here, or I suggest referring to "wolf-pack-like behaviour".

      The sentence “Overall, these observations show that V. atlanticus LGP32 is able of wolf-pack hunting attack behaviour” was replaced by “Overall, these observations suggest that V. atlanticus LGP32 can exhibit a predator-like behaviour”

      Reviewer #2 (Public review):

      Reviewer #2 listed the following weaknesses:

      (1) A lack of early, clear definitions for several important terms used in the paper, including 'predation', 'coordination' and 'coordinated action', 'group attack', and 'wolf-pack hunting', along with a corresponding lack of criteria for what evidence would warrant use of some of these labels. (For example, does mere simultaneity of attacks of an A. pacificum cell by many V. atlanticus cells constitute "coordination"? Or, as it seems to us, does coordination require some form of signalling between predator cells?)

      The term “Coordinate” was replaced by “simultaneous” throughout the manuscript.

      The terms “Group attack” and “wolf pack hunting” were replaced by “attack” throughout the manuscript.

      (2) Absence of controls for cell density in the test for starvation effects on predatory behaviour; unclear how the length of incubation affects the density of V. atlanticus cells.

      We thank the reviewer for pointing this out.

      A cell density experiment had already been performed (cf. Fig. 4A).

      The sentence “All percentages were determined based on a minimum of 2,000 cells of A. pacificum ACT03.” was added to the captions of Fig. 3, Fig. 4 and Fig. S1.

      (3) Lack of clarity in some of the methodological descriptions

      The Methodology has been checked and some improvements have been made.

      Reviewer #2 (Recommendations for the authors):

      (A) Title

      (1) Could 'induces' be better than 'promotes'?

      We agree with Reviewer #2. The initial title, “Starvation of the bacterium Vibrio atlanticus promotes lightning group-attacks on the dinoflagellate Alexandrium pacificum”, was replaced by “Starvation of the bacterium Vibrio atlanticus induces simultaneous attacks on the dinoflagellate Alexandrium pacificum”.

      (B) Abstract

      (1) Perhaps define phycosphere in the abstract - many readers might not know this word.

      We have revised the abstract to define the term phycosphere and added the sentence “This occurs in the microenvironment surrounding phytoplankton cells, the phycosphere. An interface rich in nutrients and organic molecules exuded by the cell.”

      (2) Perhaps "on dinoflagellates".

      We thank Reviewer #2 for this suggestion. We have revised the abstract by replacing “on the dinoflagellates species” with “on dinoflagellates”.

      (3) Line 33 - The word 'prey' is used without a claim of predation having yet been made; only killing has been claimed so far.

      We agree and have replaced the word “prey” by “algae” in the abstract.

      (4) Line 34 - It is unclear whether the description refers to the 'attack stage' or to 'wolf-pack attack' in general. The sentence is written in such a way that it seems to refer to 'wolf-pack attack'. However, this would seem to be incorrect, with the description being specific to V. atlanticus.

      To avoid this ambiguity, we have removed the sentence “resembles the ‘wolf-pack attack’ strategy” from the abstract.

      (5) Line 35 - Should there be a 'consumption phase'?

      We agree with Reviewer #2; “degradation” was replaced by “consumption”.

      (6) If predation is claimed later in the manuscript (which it is), it should be explicitly claimed in the abstract.

      We thank Reviewer #2 for this helpful suggestion.

      We have revised the abstract. The sentence “Results showed that Vibrio atlanticus was able to coordinate lightning group attacks then kill the dinoflagellate Alexandrium pacificum ACT03” was replaced by “The results showed that Vibrio atlanticus was capable of attacking and killing the dinoflagellate Alexandrium pacificum ACT03”.

      (C) Main text

      (1) Line 54 - Perhaps "Among HAB-causing organisms...".

      We agree with the reviewer’s suggestion and have revised the wording.

      (2) Line 56 - "that, together with..., form the "Alexandrium tamarense" complex".

      We agree with the reviewer’s suggestion and have revised the sentence.

      (3) Line 57 - What this "complex" is and its significance should be explained.

      “Among them, Alexandrium pacificum is a flagellated eukaryotic unicellular organism that together with Alexandrium tamarense and Alexandrium fundyense form the "Alexandrium tamarense" complex (Hadjadji et al., 2020)” was replaced by

      “Among them, Alexandrium pacificum is a flagellated eukaryotic unicellular organism that together with Alexandrium tamarense and Alexandrium fundyense form the "Alexandrium tamarense" complex, responsible for paralytic shellfish poisoning worldwide (Hadjadji et al., 2020)”

      (4) Line 58 - What is a Rephy survey?

      We clarified this point, “by rephy survey” was replaced by “by the French phytoplankton observation and monitoring network (Rephy)”

      (5) Line 59 - 'resulting in' instead of 'resulting of'.

      We agree with the reviewer and have replaced “resulting of” with “resulting in”.

      (6) Line 65 - It seems that ', influencing the time of appearance of blooms' would be more correct than the current phrasing. The current phrasing is unclear regarding the relation between species, tolerance range, and the time of appearance of blooms.

      To address this point, “Depending on the phytoplankton species, the tolerance range of physicochemical parameters is different and influences the time of appearance of blooms” was replaced by “Depending on the species of phytoplankton, tolerance to physicochemical parameters varies, which influences when blooms occur.”

      (7) Line 76 - Run-on sentence which should probably be split after the reference to Wang et al., 2020.

      We agree with the reviewer and have split the sentence.

      (8) Line 89 - What are these observations?

      This sentence was reformulated.

      “Based on observations from the natural environment showing a potent relationship between Vibrio and Alexandrium algae bloom events, this study aim to determine in vitro, the main factors implicated in this relationship” was replaced by “This study aims to describe observations made in the natural environment between Vibrio bacteria and Alexandrium algal blooms, and to determine in vitro the main factors involved in this relationship.”

      (9) Line 94 - This is the first clear reference to a predator-prey interaction, and it is stated as if it's established. Is it not a central goal of the study to demonstrate that predation is even happening?

      Based on the title and abstract, I would have expected the major claims of the paper highlighted in the abstract to be:

      (i) that predation of algae by bacteria occurs in this system,

      (ii) there is a social component of predation,

      (iii) claims about what induces this predatory behaviour.

      The summary has been amended accordingly, and the term “predation” has been removed, along with all sentences referring to it.

      (10) Line 99 - What does n.d. mean?

      This point was addressed in the revised version.

      (11) Line 97 section - specify qPCR.

      This point was clarified in the revised version.

      (12) Line 139 - Mentioning the oligonucleotides in this part of the methods seems out of place. Would this not fit better in the section on Gene expression analysis?

      This sentence was discarded from this paragraph.

      (13) Line 147 - Where did the co-cultured phytoplankton species come from?

      To address this point, a reference to Table 2 was added.

      (14) Line 149 - Is it known if the phytoplankton strains had all grown to the same density after 24 hours?

      The doubling time of dinoflagellates in laboratory culture is between 5 and 7 days. During the duration of the experiments, the dinoflagellate concentration did not change significantly.

      The sentence “(doubling time between 5 and 7 days)” was added

      (15) Line 150 - Was the density of the Vibrio cultures at the different incubation times measured? Density might play an important role in predation, and so it would be important to control for density in these assays.

      The concentrations of live Vibrio in each individual culture were not measured. However, the role of Vibrio density in attacks was assessed: it is shown in Figure 4A and can also be observed in Fig. 2B.

      (16) Line 153 - How long was the co-incubation?

      The incubation times were added in the revised version.

      (17) Line 158 - What is meant by "independent experiments", more exactly?

      To clarify this point, “Data are the means of three independent experiments” was replaced by “The data come from three independent experiments using independent phytoplankton cultures and independent bacterial cultures.”

      (18) Line 161 - Perhaps give the source information about the Vibrio strain at its first mention.

      A reference has been added in the revised preprint.

      (19) Line 163 - line 141 refers to multiple non-axenic species, whereas here "the algal strain" is referred to.

      And

      (20) Line 164 - language phrasing throughout the manuscript could use some polishing, e.g., "this means that additional bacteria...".

      To address this comment, “As the algal strain used in the study is not axenic, means that additional bacteria, other than the V. atlanticus LGP32, are potentially present in the experiments.” was replaced by “As the A. pacificum ACT03 strain (table 2) used in the study is not axenic, there is potential for bacteria other than V. atlanticus LGP32 to be present in the experiments.”

      (21) Line 208 - Why were both magnitude and p-value criteria used rather than just p-values?

      In the present proteomic approach, each experimental condition was measured six times, and the mean value was used to reduce random noise.

      We then retained only differences large enough to matter biologically. This is the central criterion: a change of at least 2-fold was required in order to focus exclusively on biologically relevant differences, which allowed us to control for effect size. The differences also had to be statistically significant, so we applied a threshold of P < 0.01, meaning that a difference of this size would be expected to arise by chance less than 1% of the time. We considered that using both criteria makes the results meaningful and trustworthy, rather than reflecting a small or random fluctuation.
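
      The two-criterion selection described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `is_differential`, the symmetric treatment of 2-fold decreases, and the use of a precomputed p-value are hypothetical choices, since the response does not specify which statistical test was used or how down-regulated proteins were handled.

```python
from statistics import mean


def is_differential(control, treated, p_value, min_fold=2.0, alpha=0.01):
    """Return True when a protein passes both the effect-size and significance filters.

    control, treated: replicate abundances (assumed positive), e.g. six per condition.
    p_value: assumed to come from whichever test was applied upstream.
    """
    fold = mean(treated) / mean(control)
    # Assumption: a 2-fold decrease counts the same as a 2-fold increase.
    large_enough = fold >= min_fold or fold <= 1.0 / min_fold
    return large_enough and p_value < alpha
```

      Requiring both conditions excludes proteins whose change is significant but small, as well as proteins whose change is large but not reliably distinguishable from noise.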

      (22) Line 270 - Were these three replicate experiments also "independent"; if yes, in what sense?

      “All experiments were conducted in triplicate” was replaced by “The experiments were performed using biological triplicates, each of which was analyzed in triplicate.”

      (23) Line 296 - Perhaps "the temperature-sensitivity (or resistance) of" rather than "the nature of".

      The modification was made in the new manuscript.

      (24) Line 307 - The sentence mentions only one influential period that was removed from the dataset, but the word 'whenever' suggests multiple occurrences.

      We agree, “whenever” was replaced by “because”.

      (25) Line 325 - line 327 - The rationale behind the first part of the following sentence isn't clear to me, and what is meant by the second part is also not clear.

      To clarify this point, “This result is consistent with the difficulty that Vibrio has in growing at temperatures below 20°C and with the complex interacting factors driving bloom dynamics (Laanaia et al., 2013)” was replaced by “This result is consistent with the difficulty Vibrio has in growing at temperatures below 20°C and with the many environmental factors that influence the dynamics of algae proliferation (Laanaia et al., 2013)."

      (26) Line 327 - line 328 - Hard to interpret; does this refer to living algal cells, or all algal cells, living and degraded?

      To improve clarity, “Interestingly, in spring 2015, the mean densities of all Alexandrium cells and of free-living Vibrio were positively correlated” was replaced by “Interestingly, in spring 2015, the mean densities of Alexandrium cells (living and degraded) and of free-living Vibrio were positively correlated”

      (27) Figure 2 - These results strongly point to predation, but why the Vibrio population would already be elevated in the co-culture treatment relative to the control immediately after inoculation (0 hrs) is not clear.

      The experiments were not conducted at the same time, and the first value on the graphs corresponds to the concentration of vibrio determined after 1 hour of exposure/incubation and not at time 0. Figures 2A and 2B have been modified accordingly, and substantial changes have been made to the relevant section of the results.

      (28) Line 348 - There's no mention of Figure 2C in the main text, or of the statistical test associated with it in the Figure 2 legend.

      To address this comment, Figure 2C has now been cited in the main text, and the statistical analysis method has been added to the Figure 2 caption.

      (29) Line 352 - Text descriptions of videos are not easy to connect with the video content. Label the file names the same as how they are referred to in the text.

      We agree with you. The sentence “Epifluorescence microscopy observation of GFP-labelled V. atlanticus LGP32 (previously grown in Zobell medium) in interaction showed that A. pacificum ACT03 cells that had lost their motility were attacked individually by V. atlanticus LGP32 before being lysed (Fig, 2C and Video 1).” was rephrased and replaced by “Epifluorescence microscopy observation of GFP-labelled V. atlanticus LGP32 (previously grow in Zobell medium) in interaction showed that V. atlanticus LGP32 simultaneously attacks A. pacificum ACT03 cells (Fig, 2C and Video 1).”

      (30) Movie 1 could be cut to remove uninteresting footage at the start. What indicates lysis? Is the deformation of the cells an indication of lysis?

      To respond to this comment, Video 1 has been shortened and in the caption, “degraded” was replaced by “lysed”

      (31) Line 353 - Video could be zoomed in more on a few typical attacks to remove visual noise.

      A chronological overview of an attack has been added to Figure 2 corresponding to Figure 2D, and a chronological overview of the overall event has been added to Figure 3 corresponding to Figure 3B1.

      (32) Line 355 - There does not seem to be a Figure 3A2.

      To address this point, Fig. 2 and Fig. 3 have been revised for clarity (see above).

      (33) Figure 3 - Can the authors fully exclude an effect of bacterial density as distinct from an effect of growth/starvation phase? It would be helpful to determine bacterial viable population densities at 12, 36, 60, and 126 hrs of incubation in Zobell medium, and to control for density in testing for effects on algae.

      Information on the densities of Vibrio incubated in Zobell medium for 12, 36, 60, and 126 hours has now been included in the results section “Attack of A. pacificum ACT03 is activated by V. atlanticus LGP32 starvation.”

      (34) Line 363 - It is unclear how the degradation of the flagella is apparent from movie 3. It would be helpful to have a comparison with healthy flagella.

      Alexandrium cells with intact flagella move so quickly that it is impossible for us to follow them and film their flagella with the tools at our disposal.

      For greater clarity, arrows have been added to videos 3, 4 and 5.

      (35) Line 364 - Sudden change from referring to the recording as 'video' instead of movie. What is meant by erratic swimming? The cell does not seem to move much.

      To address this comment, “Movie” was replaced by “Video” throughout the manuscript and “erratic swimming” was replaced by “irregular swimming”

      (36) Line 365 - How did you observe the detachment of the flagellum?

      The detachment of the flagellum can be observed using a confocal microscope. This process was filmed and presented in Video 3. Arrows have been added to the video to clearly indicate the flagellum detachment.

      (37) Line 368 - Perhaps this is due to it not being clear regarding which movie is meant, but there is no clear attack visible in movie 4.

      To make this clearer, arrows have been added to the video 4 to indicate attached cells.

      And the sentence in the caption of the video 4 “Vibrio, filmed under a confocal microscope, attacks in groups one immobilized Alexandrium cell then moves on to attack — still as a group — another cell without touching the other whole cells, suggesting active communication between Vibrio cells” was rewritten and replaced by “This video, recorded under a confocal microscope, shows Vibrios simultaneously attacking a first immobilized Alexandrium cell, then moving on to attack a second cell without ever targeting the other cells present, suggesting active communication between the Vibrio bacteria.”

      (38) Line 369 - It seems the peak attack % was reached at 45 minutes, not 15-30 minutes.

      Sorry for the confusion. In fig. 3 for more clarity, the sentence “(A) Percentage of A. pacificum ACT03 motile cells. (B) cells attacked by V. atlanticus LGP32 and (C) cells lysis after 0, 15, 30, 45 and 60 min of interaction” was replaced by “(A) Cumulative percentage of motile A. pacificum ACT03 cells. (B) Cumulative number of cells attacked by V. atlanticus LGP32 and (C) Cumulative cell lysis after 0, 15, 30, 45 and 60 minutes of interaction.”

      (39) Line 382 - "clearly show role of nutrient limitation", see comment re controlling for any role of bacterial density.

      To address this point, information on Vibrio densities was added to the manuscript (cf. comment 33).

      (40) Line 385 - line 386 - Phrasing unclear.

      We have revised the text accordingly: “To this aim, A. pacificum ACT03 in exponential growth phase was first exposed for 30 min to supernatant from 60 hours starved V. atlanticus LGP32 Zobell media that induced 25% lysis of A. pacificum ACT03 cells and next to the corresponding V. atlanticus LGP32 cells. Group attacks were observed on non-degraded A. pacificum ACT03 cells, but not on lysed cells.” was replaced by “To this end, A. pacificum ACT03 in exponential growth phase was first exposed for 30 minutes to the supernatant of a 126-hour culture of V. atlanticus LGP32, which induced lysis of 70% of the A. pacificum ACT03 cells (Figures 3C and 3C1, arrow 2 and video 4). Next, cells of V. atlanticus LGP32 from a 60-hour culture, capable of attacking A. pacificum ACT03 cells (Fig. 3B), were added. For 1 hour of exposure, no attack was observed on the previously lysed algae.”

      (41) Line 413 - Is this the only pathway for quorum sensing in V. atlanticus?

      Indeed, the last two sentences of this paragraph are unclear.

      To address this point:

      “By targeted mutagenesis of key genes involved in QS pathways ΔluxM (HAI-1 production), ΔluxS (AI-2 production) and ΔluxR (high-density QS master regulator) did not lead to any change in the attack behaviour of V. atlanticus LGP32 (Fig. 4C).” was replaced by “Targeted mutagenesis of key genes involved in two of the three known QS pathways in vibrios (Fig. S3), ΔluxM (HAI-1 production), ΔluxS (AI-2 production), and ΔluxR (main high-density QS regulator), did not result in any changes in the attack behavior of V. atlanticus LGP32 (Fig. 4C).”

      And “Taken together these results showed that attack by V. atlanticus LGP32 is not link to QS.” was replaced by “Combined with the absence of overexpression of the CqsS gene (inducible by CAI-1) involved in the last known QS pathway in Vibrio (Fig. S3), these results indicated that the attack by V. atlanticus LGP32 is most likely unrelated to QS.”

      (42) The references to tropism aren't clear.

      You're right, there's no reason to use the term tropism here. We have removed it.

      (43) Line 439 - Why was H3BO4 used as a control for the addition of FeCl3?

      For clarity, the sentence “Boron being known to be a regulator or capable of being transported by vibrioferrin (Romano et al., 2013; Weerasinghe et al., 2013), we tested its potential involvement in the interaction but no effect was evidenced here.” was replaced by “Given that boron is known for its role in regulating a global bacterial cellular response to phytoplankton and to bind to vibrioferrin (Romano et al., 2013; Weerasinghe et al., 2013), we tested its potential involvement in simultaneous vibrio attacks. Compared to the Zobell control, no effect on the number of attacks was observed”

      (44) Line 441 - line 449 - Should explicitly say in text that no attacks were observed for any species other than the Alexandrium and Gymnodinium species.

      We agree and have explicitly stated in the text that no attacks were observed for any species other than Alexandrium and Gymnodinium.

      (45) Line 454 - line 455 - The last part of this sentence seems a strange statement, since

      (i) it has long been known that predatory bacteria can eat a wide range of eukaryotes, (ii) one of the cited papers (Perez et al) actually highlights a case of bacterial predation on algae, and (iii) in the next paragraph the authors themselves highlight Streptomyces predation of algae.

      To make this clearer, “Among predators, predatory bacteria are found in a wide variety of environments, and like bacteriophages and predatory protists, they have been reported to prey exclusively on other bacteria” was replaced by “Among predators, predatory bacteria are found in a wide variety of environments and, like bacteriophages and predatory protists, feed primarily on other bacteria, although a few cases of predation on microbial eukaryotes have also been reported.”

      (46) Line 455 - Better to clarify the authors' definition of a predator at the start of the paper. The offered definition seems more like a definition of 'consumer' than 'predator', as the latter normally involves both the killing and consumption of other organisms, not just consumption with some kind of "expense".

      To address this comment:

      - “predator behaviour” was replaced by “predator-like behaviour”

      - and “Considering predator as a free organism that feeds at the expense of another, this study is the first evidence of the capacity of some Vibrio to develop a predatory strategy against an alga. This behaviour differs from parasitism, because the survival of Vibrio is not exclusively dependent on algae in environment” was replaced by “Consider a predator as a free-living organism that kills its prey and feeds on it, this study provides data suggesting the ability of Vibrios to develop an original predator-like behaviour to kill and feed on algae.”

      (47) Line 457 - Don't see the benefit of trying to distinguish from parasitism here, especially since parasitism can be facultative, whereas the authors' phrasing suggests that it is always obligate.

      You are right, this sentence has been deleted.

      (48) Line 463 - line 464 - The authors should clearly explain exactly what detailed aspects of Myxococcus and Lysobacter predation they think the "attack stage" of V. atlanticus resembles.

      Accordingly, “The second stage, the ‘attack stage’ corresponding to physical contact between Vibrio and Alexandrium resembles the ‘wolf-pack attack’ strategy described for Myxococcus xanthus and Lysobacter regardless of the prey species used, M. xanthus must be in close proximity to prey cells in order to induce their lysis and to benefit from their biomass (Martin, 2002; Perez et al., 2014)” was replaced by “The second stage, the ‘attack stage’ corresponding to the physical contact between Vibrios and Alexandrium, is similar to the strategy used by Myxococcus xanthus and Lysobacter. These bacteria must be in close proximity to their prey in order to cause lysis and utilize their biomass, regardless of the prey's species (Martin, 2002; Genovesi et al., 2013; Perez et al., 2016; Zhang et al., 2020)”

      (49) Line 466 - line 467 - The comparison to bacteria clustering around lysed cells is surprising since the authors show that V. atlanticus does not attack already lysed cells.

      The sentence was rephrased: “This phenomenon is comparable to that of bacteria clustering around lysed ciliate cells” was replaced by “Visually, this phenomenon resembles bacteria clustering around lysed ciliate cells.”

      (50) Line 469 - Missing is a statement of exactly what criteria constitute "wolf-pack hunting behaviour" and exactly how V. atlanticus meets those criteria.

      To address this point, “wolf-pack hunting behaviour” was replaced by “predator-like behaviour”

      'Able of' should be corrected to 'Capable of'.

      We agree and have reworded the sentence.

      (51) Line 470 - Consider starting a new paragraph for the material on quorum sensing.

      Accordingly, we have separated the section concerning QS pathway from the section concerning iron pathway.

      (52) As part of their discussion on the role of iron uptake, can the authors comment on any relationship between starvation and iron uptake, and in particular the observations that, while general nutrient deprivation induces attacks, supplementation with a specific nutrient (iron) also induces attacks (Figure 4D)? Do bacteria starved for general growth substrates take up more iron than growing bacteria?

      To respond to this comment, “Future study could demonstrate further the role of vibrioferrin in group attack, by adding iron-saturated vibrioferrin to algae-Vibrio co-cultures.” was replaced by “Interestingly, if a general nutrient deficiency causes attacks, iron supplementation increases the number of attacks (Figure 4D), suggesting the importance of iron absorption in the attack behavior. Future studies should determine whether nutrient deficiency increases the iron absorption capacity of Vibrios and whether this plays a major role in the attack mechanism.”

      (53) Line 486 - Of what is boron known to be a regulator?

      To respond to this comment, “Given that boron is known for its regulatory properties and for being transportable by vibrioferrin” was replaced by “Given that boron is known for its role in regulating a global bacterial cellular response to phytoplankton and to bind to vibrioferrin”.

    2. eLife Assessment

      This important study convincingly shows that Vibrio bacteria act as predators of ecologically significant algae that contribute to harmful blooms in the lab and in their natural habitat, and that predation is induced by starvation. The authors suggest a working model that can be the basis for future work on this system. The study will be very impactful to those interested in the diversity of microbial predator-prey interactions and controlling toxic algal bloom.

    3. Reviewer #1 (Public review):

      Summary:

      Rolland and colleagues investigated the interaction between Vibrio bacteria and Alexandrium algae. The authors found a correlation between the abundance of the two in the Thau Lagoon and observed in the laboratory that Vibrio grows to higher numbers in the presence of the algae than in monoculture. Timelapse imaging of Alexandrium in coculture with Vibrio enabled the authors to observe Vibrio bacteria in proximity to the algae and subsequent algae death. The authors further determine the mechanism of the interaction between the two and point out similarities between the observed phenotypes and predator prey behaviours across organisms.

      Strengths:

      The study combines field work with mechanistic studies in the laboratory and uses a wide array of techniques ranging from co-cultivation experiments to genetic engineering, microscopy and proteomics. Further, the authors test multiple Vibrio and Alexandrium species and claim that the observed phenotypes are widespread.

      Comments on revisions:

      I thank the authors for their additional work on the manuscript. My comments were addressed to my satisfaction.

    4. Reviewer #2 (Public review):

      Goal summary

      The authors sought to (i) demonstrate correlations between the dynamics of the dinoflagellate Alexandrium pacificum and the bacterium Vibrio atlanticus in natural populations, (ii) demonstrate the occurrence of predation in laboratory experiments, (iii) demonstrate that predation is induced by predator starvation, and (iv) test for effects of quorum sensing and iron-uptake genes on the predation process.

      Strengths include

      - Data indicating correlated dynamics in a natural environment that increase the motivation for study of in vitro interactions
      - Experimental design allowing clear inference of predation based on population counts of both prey and predators in addition to microscopy-based evidence
      - Supplementation of population-level data with molecular approaches to test hypotheses regarding possible involvement of quorum sensing and iron uptake in predation

      Weaknesses include

      - A quantitative analysis of effects of manipulating V. atlanticus density on rates of predation would have been valuable
      - Lack of clarity in some of the methodological descriptions

      Appraisal

      The authors convincingly demonstrate that V. atlanticus can prey on A. pacificum, provide strongly suggestive evidence that such predation is induced by starvation and clearly demonstrate that both iron availability and correspondingly the presence of genes involved in iron uptake strongly influence the efficacy of predation.

      Discussion of impact

      This paper will interest those interested in the diversity of forms of microbial predation and how microbial predatory behavior responds to environmental fluctuations. It will also interest those investigating bacteria-algae interactions and potential ecological controls of algal blooms. It may also interest researchers of microbial cooperation in light of the suggestion of communication between predator cells.

    1. eLife Assessment

      This important study has demonstrated that MORC2 undergoes phase separation in cells and established multiple interactions responsible for the phase separation. Upon revision, the data generally provide solid support to the claim that MORC2 condensates are functionally relevant in gene regulation and begin to demonstrate the importance of the physical properties of biological condensates. Nevertheless, there remains some weakness in the connection between condensates and function.

    2. Reviewer #1 (Public review):

      This work demonstrates that MORC2 undergoes phase separation (PS) in cells to form nuclear condensates, and the authors demonstrate convincingly the interactions responsible for this phase separation. Specifically, the authors make good use of crystallography and NMR to identify multiple protein:protein interactions and use EMSA to confirm protein:DNA interactions. These interactions work together to promote in vitro and in-cell phase separation and boost ATPase activity by the catalytic domain of MORC2.

      Moreover, the authors show solid evidence supporting their important claim that MORC2 PS is important for MORC2-mediated gene regulation. Exploring causal links between PS and function is an important need in the phase separation field, particularly as regards the role of condensates in gene regulation, and is a non-trivial matter. It is crucial and challenging to properly explore the alternative possibility that soluble complexes, existing in the same conditions as phase-separated condensates, are the functional species. The authors have attempted to address this concern by manipulating the physical nature of the MORC2 condensates using a killswitch (KS) peptide (MORC2 +KS), finding that reducing condensate dynamics results in a cellular phenotype very similar to that of the phase separation-deficient MORC2 mutants. While not fully ruling out the alternative, soluble-complex hypothesis, this experiment suggests that function is indeed localized inside the MORC2 condensates, and that perturbing the condensate can be functionally equivalent to removing condensate formation.

      The authors also look at several disease related mutants of MORC2. While most of these do not seem to have an obvious connection to the phase separation data, it is quite interesting that one mutant, E236G, displays similar intra-condensate dynamics compared to MORC2 +KS, strengthening the claim that MORC2 phase separation is important for function and suggesting that the observations in this paper may indeed have some disease relevance.

      Strengths

      Static light scattering and crystallography are nicely used to demonstrate the dimerization of MORC2FL and to discover the structure of the CC3 domain dimer, presumably responsible for the dimerization of MORC2FL (Figure 1).

      Extensive use of deletion mutants in multiple cell lines is used to identify regions of MORC2 that are important for forming condensates in the nucleus: the IBD, IDR, and CC3 domains are all found to be essential for condensate formation, while the CW domain plays an unknown role in condensate morphology (Figure 3). The authors use NMR to further identify that the IBD domain seems to interact with the first third of the centrally located IDR, termed IDRa, but not with the latter two thirds of the IDR domain (Figure 4). This leads them to propose that phase separation is the product of IBD:IDRa interaction, CC3 dimerization, and an unknown but important role for the CW domain.

      Based on the observation that removal of the NLS resulted in diffuse cytoplasmic localization, they hypothesized that DNA may play an important role in MORC2 PS. EMSA was used to demonstrate interaction between DNA and several MORC2 domains: CC1, CC2, IDR, and TCD-CC3-IBD. Further in vitro microscopy with purified MORC2 showed that DNA addition significantly reduces MORC2 saturation concentration (Figure 5).

      These assays convincingly demonstrate that MORC2 phase separates in cells and identify the protein domains and interactions responsible for this phenomenon.

      Weaknesses

      The connection between condensates and function, while improved from the original manuscript, still has some weak points.

      The central experiment demonstrating that MORC2 condensates mediate function takes the form of RNA-Seq in MORC2 KO HeLa cells (Figure 6), rescued with WT, condensate-deficient mutants, and a KS peptide mutant that reduces dynamics by increasing homotypic protein interactions. The observation that rescuing with MORC2 +KS is ineffective, in a manner similar to rescue with condensate-deficient MORC2 mutants, suggests that unperturbed condensates are important for function. An alternative possibility, however, is that condensates are non-functional bystanders, and that the increased homotypic interactions present in MORC2 +KS result in stronger MORC2 +KS recruitment to condensates, reducing the pool of functional, dilute phase MORC2 +KS and squashing function via sequestration. Similar ideas have been explored by others for transcription factors (e.g. Chong et al, Mol Cell, 2022). This possibility is neither discussed nor ruled out. The absence of microscopy data showing similar localization of MORC2 and MORC2 +KS (particularly the amount of diffuse MORC2 outside condensates) amplifies this concern.

      The RNA-Seq data presented in Figure 6h also has some concerning qualities. Inter-replicate variability is higher than ideal, particularly for MORC2 deltaCC3. This may be a product of the transient transfection system used for these experiments, which inherently results in stochasticity. Specific sets of genes regulated by MORC2 are consistent with the main conclusion (Figure 6i, individual genes in 6h, showing that all mutants are more similar to one another than to WT MORC2), but global transcription shifts seem quite different between MORC2 condensate-deficient mutants and MORC2 +KS (Figure 6h heatmap), suggesting much more than simple condensate disruption is taking place. Together, this weakens the conclusion that MORC2 condensates are the functional form of MORC2.

    3. Reviewer #2 (Public review):

      Summary:

      The study by Zhang et al. focuses on how condensation of a chromatin-associated protein, MORC2, regulates gene expression. Their study shows that MORC2 forms dynamic nuclear condensates in cells. In vitro, MORC2 phase separation is driven by dimerization and multivalent interactions involving the C-terminal domain, but also involves interplay with other parts of MORC2. A key finding is that the intrinsically disordered region (IDR) of MORC2 exhibits strong DNA binding. They report that DNA binding enhances MORC2's phase separation and its ATPase activity, offering new insights into how MORC2 contributes to chromatin organization and gene regulation. The authors correlate MORC2's condensate-forming ability and material properties with its gene silencing function using a few variants. Moreover, they investigate the effect of disease-linked mutations in the N-terminal domain of MORC2 on its ability to form cellular condensates, ATPase activity and DNA-binding. Their work implies that proper material properties of MORC2 condensates may be important to their biological function.

      Strengths:

      The authors determined a 3.1 Å resolution crystal structure of the dimeric coiled-coil 3 (CC3) domain of MORC2, revealing a hydrophobic interface that stabilizes dimer formation. They present extensive evidence that MORC2 phase separates across multiple contexts, including in vitro, in cellulo, and in vivo. Through systematic cellular screening, they identified the C-terminal domain of MORC2 as a key driver of condensate formation. Biophysical and biochemical analyses further show that the IDR within the C-terminal domain interacts with the C-terminal end region (IBD) and also exhibits strong DNA-binding capacity (using 601 DNA), both of which promote MORC2 phase separation. Together, this study emphasizes that interactions mediated by multiple domains (CC3, IDR, and IBD) drive MORC2 phase separation. Additionally, the work uses a unique kill-switch peptide fused to the MORC2 sequence to disrupt its material properties; this permits the authors to examine the link between material properties and transcription function. The study is overall strengthened by (1) the combination of variants tested both in vitro and in cellulo, and (2) the systematic examination of domain contributions that highlight the multivalent interactions at play mediating MORC2 condensation.

      Weaknesses:

      The employed MORC2 variants have enabled the beginning of an investigation linking condensation and biological function, but more work will be needed to really dissect the contribution of condensation to DNA-binding, ATPase activity, and gene silencing. A systematic investigation of differential material properties on MORC2 condensates will be needed to assess the link to biological function, especially as the authors' work is reminiscent of how the liquidity of Caulobacter crescentus PopZ condensates tunes bacterial fitness.

    4. Reviewer #3 (Public review):

      Summary:

      The manuscript by Zhang et al. demonstrates that MORC2 undergoes liquid-liquid phase separation (LLPS) to form nuclear condensates critical for transcriptional repression. Using a combination of in vitro LLPS assays, cellular studies, NMR spectroscopy, and crystallography, the authors show that a dimeric scaffold formed by CC3 drives phase separation, while multivalent interactions between an intrinsically disordered region (IDR) and a newly defined IDR-binding domain (IBD) further promote condensate formation. Notably, LLPS enhances MORC2 ATPase activity in a DNA-dependent manner and contributes to transcriptional regulation, establishing a functional link between phase separation, DNA binding, and transcriptional control.

      Strengths:

      The manuscript is well organized and logically structured. It provides valuable mechanistic insights into MORC2 function, and the majority of the conclusions are well supported by the data presented.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      This work demonstrates that MORC2 undergoes phase separation (PS) in cells to form nuclear condensates, and the authors demonstrate convincingly the interactions responsible for this phase separation. Specifically, the authors make good use of crystallography and NMR to identify multiple protein: protein interactions and use EMSA to confirm protein: DNA interactions. These interactions work together to promote in vitro and in cell phase separation and boost ATPase activity by the catalytic domain of MORC2.

      However, the authors have very weak evidence supporting their potentially valuable claim that MORC2 PS is important for the appropriate gene regulatory role of MORC2 in cells. Exploring causal links between PS and function is an important need in the phase separation field, particularly as regards the role of condensates in gene regulation, and is a non-trivial matter. Any study with convincing data on this matter will be very important. For this reason, it is crucial to properly explore the alternative possibility that soluble complexes, existing in the same conditions as phase-separated condensates, are the functional species. It is also critical to keep in mind that, while a specific protein domain may be essential for PS, this does not mean its only important function pertains to PS.

      In this study, the authors do not sufficiently explore the role that soluble MORC2 complexes may play alongside MORC2 condensates. Neither do they include enough data to solidly show that domain deletion leads to phenotypes via a loss of phase separation per se, rather than the loss of phase separation being a microscopically visible result, not cause, of an underlying shift in protein function. For these reasons, the authors' conclusions regarding the functional role of MORC2 condensates are based on incomplete data. This also dampens the utility of this work as a whole, since the very nice work detailing the mechanism of MORC2 PS is not paired with strong data showing the importance of this observation.

      We thank the reviewer for this thoughtful and constructive critique. We agree that establishing a causal link between phase separation (PS) and biological function—particularly in transcriptional regulation—is a central and non-trivial challenge in the condensate field. We also appreciate the reviewer’s emphasis on two critical alternative interpretations: (i) that soluble MORC2 complexes, rather than condensates, may represent the primary functional species, and (ii) that loss of phase separation upon domain deletion could reflect a downstream consequence of altered protein function rather than its cause.

      To address these concerns, we have performed a series of new experiments specifically designed to decouple condensate formation from condensate dynamics, thereby allowing us to more rigorously interrogate the functional relevance of MORC2 condensates.

      First, to overcome the limitation of domain deletions, which may affect MORC2 function beyond phase separation, we introduced a micropeptide-based kill switch (KS) at the C terminus of MORC2. This strategy has recently emerged as a powerful approach to selectively reduce condensate dynamics without disrupting protein expression, folding, or domain architecture [1]. Importantly, unlike CC3 or IDRa deletions, MORC2+KS robustly forms nuclear condensates but exhibits markedly reduced internal dynamics, as demonstrated by FRAP analyses showing minimal fluorescence recovery after photobleaching (Fig. 6a-c). This strategy therefore allows us to perturb condensate material properties independently of MORC2 domain integrity.

      Second, we systematically compared the transcriptional consequences of rescuing MORC2-knockout HeLa cells with MORC2FL, condensation-deficient mutants (ΔCC3 and ΔIDRa), and the dynamics-defective MORC2+KS (Fig. 6d). Despite being expressed at substantially higher levels than MORC2FL (Fig. 6e), all three mutants showed a striking and consistent failure to restore MORC2-dependent transcriptional regulation (Fig. 6f-h). This effect was particularly pronounced for transcriptionally repressed genes, including two sets of high-confidence MORC2 targets reported in prior studies (Fig. 6i and Fig. S10). These findings demonstrate that neither increased protein abundance nor the mere presence of condensate-like structures alone is sufficient to restore MORC2 function.

      Third, our data instead support a model in which both soluble MORC2 complexes and dynamic MORC2 condensates are required for full transcriptional regulation activity. While soluble MORC2 is likely involved in target recognition and complex assembly, our results indicate that proper condensate formation—and critically, condensate dynamics—are essential for effective transcriptional repression and activation. The inability of the MORC2+KS mutant to rescue transcriptional defects, despite intact condensate formation, points away from a model in which MORC2 condensates represent only microscopically visible byproducts of MORC2 activity.

      We believe these new data strengthen the manuscript by pairing the detailed mechanistic dissection of MORC2 phase separation with direct functional evidence, enhancing the conceptual impact and biological significance of the study.

      Strengths:

      Static light scattering and crystallography are nicely used to demonstrate the dimerization of MORC2FL and to discover the structure of the CC3 domain dimer, presumably responsible for the dimerization of MORC2FL (Figure 1).

      Extensive use of deletion mutants in multiple cell lines is used to identify regions of MORC2 that are important for forming condensates in the nucleus: the IBD, IDR, and CC3 domains are found to be essential for condensate formation, while the CW domain plays an unknown role in condensate morphology (Figure 3). The authors use NMR to further identify that the IBD domain seems to interact with the first third of the centrally located IDR, termed IDRa, but not with the latter two-thirds of the IDR domain (Figure 4). This leads them to propose that phase separation is the product of IBD:IDRa interaction, CC3 dimerization, and an unknown but important role for the CW domain.

      Based on the observation that removal of the NLS resulted in diffuse cytoplasmic localization, they hypothesized that DNA may play an important role in MORC2 PS. EMSA was used to demonstrate interaction between DNA and several MORC2 domains: CC1, CC2, IDR, and TCD-CC3-IBD. Further in vitro microscopy with purified MORC2 showed that DNA addition significantly reduces MORC2 saturation concentration (Figure 5).

      These assays convincingly demonstrate that MORC2 phase separates in cells, and identify the protein domains and interactions responsible for this phenomenon, with the notable caveat that the role of the CW domain here is left unexplored.

      We thank the reviewer for their positive and detailed assessment of the strengths of our study. Our understanding of the CW domain’s function remains preliminary. Although we observed that the CW domain can influence condensate size, the IDR, IBD, and CC3 domains constitute the core structural elements driving phase separation. Consequently, the CW domain was not a primary focus of the current study. Nonetheless, investigating its functional contributions represents an interesting avenue for future work.

      Weaknesses:

      Although the authors demonstrated phase separation of MORC2FL, their evidence that this plays a functional role in the cell is incomplete.

      Firstly, looking at differentially upregulated genes under MORC2FL overexpression, the authors acknowledge that only 10% are shared with differentially regulated genes identified in other MORC2FL overexpression studies (Figure 6c, d). No explanation is given for why this overlap is so low, making it difficult to trust conclusions from this data set.

      We thank the reviewer for raising this important concern. In response, we have improved the quality and robustness of our RNA-seq analysis by repeating the experiments with optimized sample handling and increased sequencing depth. Using this updated dataset, we identified a considerably higher overlap between MORC2-regulated genes in our study and those reported previously.

      Specifically, we observed 84 overlapping genes with the study by Nikole L. Fendler et al. [2], corresponding to approximately 32% of the MORC2-regulated genes reported in that work (Fig. 6i). In addition, we identified 102 overlapping genes with the dataset reported by Iva A. Tchasovnikarova et al. [3], representing approximately 22% of the genes identified in that study (Fig. S10b).

      We note that complete concordance with previous reports is not expected, given substantial differences in experimental design. For example, Fendler et al. employed a doxycycline-inducible MORC2 expression system [2], whereas our study relies on transient overexpression in MORC2-knockout HeLa cells. In contrast, Tchasovnikarova et al. compared transcriptomes between MORC2 knockout and wild-type cells [3], rather than MORC2 rescue conditions. Moreover, RNA-seq results are inherently influenced by cell line batch variability, sequencing depth, and analysis pipelines, all of which differ across studies.

      Taken together, we consider an overlap in the range of ~20–30% to be reasonable and biologically meaningful in the context of these experimental differences, and we believe that the revised RNA-seq data provide a more reliable foundation for our conclusions regarding MORC2-dependent transcriptional regulation.

      Secondly, of the 21 genes shared in this study and in earlier studies, the authors note that the differential regulation is less pronounced when a phase-separation-deficient MORC2 mutant is overexpressed, rather than MORC2FL (Figure 6e). This is taken as evidence that phase separation is important for the proper function of MORC2. However, no consideration is made for the alternative possibility that the mutant, lacking the CC3 dimerization domain, may result in non-functional complexes involving MORC2, eliminating the need for a PS-centric conclusion. To take the overexpression data as solid evidence for a functional role of MORC2 PS, the authors would need to test the alternative, soluble complex hypothesis. Furthermore, there seems to be low replicate consistency for the MORC2 mutant condition (Figure S6a), with replicate 3 being markedly upregulated when compared to replicates 1 and 2.

      We thank the reviewer for raising these important concerns. In the revised manuscript, we have substantially strengthened both the experimental evidence and the data presentation to directly address the alternative “soluble complex” interpretation as well as the issue of replicate consistency. Specifically, we now provide data that clarify the functional impact of phase-separation-deficient MORC2 mutants and explicitly show replicate-level RNA-seq analyses. Fig. 6 and Fig. S10 support these improvements and enhance both the robustness and transparency of our transcriptional analyses. Collectively, these revisions directly address the reviewer’s concerns regarding the functional interpretation of MORC2 phase separation.

      Thirdly, the authors close by examining the in-cell PS capabilities and ATPase activity of several disease-associated mutants of MORC2 (Figure 7). However, the relevance of these mutants to the past 6 figures is unclear. None of these mutations is in regions identified as important for PS. Two of the mutations result in a higher percentage of the cell population being condensate-positive, but this is not seemingly connected to ATPase activity, as only one of these two mutants has increased ATPase activity. Figure 7 does not add any support to the main hypotheses in the paper, and nowhere in the paper do the authors investigate the protein regions where the mutations in Figure 7 are found.

      We thank the reviewer for raising this point regarding Fig. 7. At the current stage, the results for disease-associated mutations are primarily descriptive. While we observed that certain mutations clustered at the N-terminus can affect MORC2 condensate formation, ATPase activity, and DNA binding, we did not identify a mechanistic explanation for these correlations. Notably, the T424R mutation, previously reported to significantly enhance ATPase activity [4], also increased both intracellular condensate formation and in vitro DNA binding in our experiments. In contrast, other mutations did not show such consistent effects. Previous studies have established that MORC2’s ATP-binding and DNA-binding activities are independent [4]. Our results further suggest that MORC2’s phase separation behavior is independent of both ATP and DNA binding affinity, although existing evidence hints at potential cross-regulatory interactions among these three functions.

      We would also like to emphasize an additional observation that may help contextualize the relevance of N-terminal mutations. Although deletion of the MORC2 N-terminus does not prevent the remaining C-terminal region from forming nuclear condensates, these C-terminal condensates exhibit a marked loss of fluorescence recovery in FRAP assays (Fig. S11). This finding suggests that while the N-terminus is not strictly required for condensate assembly, it plays an important role in regulating condensate fluidity. Accordingly, disease-associated mutations distributed across the N-terminal region may influence MORC2 function by modulating condensate material properties rather than condensate formation per se. Based on this hypothesis, we evaluated the fluidity of condensates formed by the E236G and T424R mutants. FRAP measurements indicated substantially reduced fluorescence recovery in E236G, whereas T424R exerted minimal effects (Fig. 7e, f).

      Overall, our interpretation of the results in Fig. 7 is still at a preliminary stage. Nevertheless, the role of the MORC2 N-terminus in modulating condensate fluidity, together with the observed impairment caused by the E236G mutation, appears to be robust, although the underlying mechanism remains to be elucidated. We have incorporated additional discussion on this point and consider it an important direction for future study.

      Reviewer #1 (Recommendations for the authors):

      (1) Why does MORC2 overexpression lead to changes in gene regulation that are so different from past MORC2 overexpression studies? This is unsettling to me.

      (2) Likewise, why is replicate 3 for the MORC2ΔCC3 variant so different from replicates 1 and 2? Perhaps repeating this experiment would be helpful, both for showing better repeatability and perhaps as regards pulling out a stronger phenotype.

      We have repeated the experiments and obtained improved data quality.

      (3) A better explanation of the relevance of Figure 7 to the story of the rest of the paper, especially the phase-separation of MORC2, would be important to improving this paper.

      We thank the reviewer for this suggestion. We have performed additional experiments and expanded the discussion.

      (4) Are expression levels of mutant proteins in Figure 7 uniform between mutants? If not, is it possible that expression levels might account for the difference in condensate-positive cells between mutants?

      We cannot fully exclude the possibility that differences in expression levels may contribute to the observed differences among mutants. In our experiments, equal amounts of plasmid DNA were used for transfection across all conditions. Although we did not directly quantify post-transfection protein expression levels by immunoblotting or similar approaches, even if certain mutations were to affect protein expression, it would be technically challenging to further optimize the strategy to fully normalize expression levels across mutants.

      Importantly, we note that MORC2 does not form condensates in all transfected cells, even when EGFP fluorescence indicates robust expression levels that are comparable to, or even exceed, those observed in condensate-positive cells. This observation suggests that high expression alone is not sufficient to drive MORC2 phase separation in cells. Therefore, we do not favor the interpretation that the E236K and T424R mutations enhance MORC2 condensation simply by increasing MORC2 protein expression levels.

      Minor:

      (1) I would suggest considering using the term "dynamic" rather than "liquid-like", as FRAP is technically a measurement of the dynamicity of a protein within a volume, rather than a measurement of the actual fluidity of that volume.

      We thank the reviewer for this helpful suggestion. We agree that FRAP measurements primarily report protein mobility and condensate dynamics rather than the physical fluidity of the condensates. We have therefore revised the manuscript to replace “liquid-like” with “dynamic” where conclusions are based on FRAP analyses.

      (2) A further investigation of the role of the CW domain would be very interesting, since it clearly has a major role in condensate morphology. Perhaps CW confers important heterotypic interactions which contribute to compositional control of the MORC2 condensates, and thus function and morphology? However, due to the complexity of this specific question and the potentially marginal improvement offered by this paper, I do not think this is a critical addition.

      We thank the reviewer for this insightful suggestion. We have noted this possibility in the Discussion as an important avenue for future investigation.

      (3) Why is TCD not tested alone by EMSA for affinity to DNA in Figure 5?

      Our inference regarding the DNA-binding capacity of the TCD domain was based on comparative EMSA analyses. Specifically, we found that the TCD–CC3–IBD fragment was able to bind DNA, whereas the CC3–IBD fragment alone showed no detectable DNA binding. From this comparison, we inferred that the TCD domain is responsible for the observed DNA-binding activity.

      Because the TCD domain does not affect MORC2 condensate formation, it was not a central focus of the present study, which primarily aims to elucidate the mechanisms underlying MORC2 phase separation and its functional relevance. For this reason, we did not further test TCD alone by EMSA in Figure 5.

      Reviewer #2 (Public review):

      Summary:

      The study by Zhang et al. focuses on how phase separation of a chromatin-associated protein, MORC2, could regulate gene expression. Their study shows that MORC2 forms dynamic nuclear condensates in cells. In vitro, MORC2 phase separation is driven by dimerization and multivalent interactions involving the C-terminal domain. A key finding is that the intrinsically disordered region (IDR) of MORC2 exhibits strong DNA binding. They report that DNA binding enhances MORC2's phase separation and its ATPase activity, offering new insights into how MORC2 contributes to chromatin organization and gene regulation. The authors try to correlate MORC2's condensate-forming ability with its gene silencing function, but this warrants additional controls and validation. Moreover, they investigate the effect of disease-linked mutations in the N-terminal domain of MORC2 on its ability to form cellular condensates, ATPase activity, and DNA-binding, though the findings appear inconclusive in the manuscript's current form.

      Thank you for your thorough and constructive review of our manuscript. In response to the concerns raised regarding the functional relevance of MORC2 condensate formation, we have redesigned and expanded the experiments presented in Fig. 6 and Fig. S6 to directly link MORC2’s condensate-forming capacity with its transcriptional regulatory function. These new experiments provide additional controls and validation, strengthening the causal relationship between MORC2 condensate dynamics and gene regulation.

      At the current stage, the results for disease-associated mutations are descriptive. While we observed that certain mutations clustered at the N-terminus can affect MORC2 condensate formation, ATPase activity, and DNA binding, we did not identify a mechanistic explanation for these correlations. Notably, the T424R mutation, previously reported to significantly enhance ATPase activity [4], also increased both intracellular condensate formation and in vitro DNA binding in our experiments. In contrast, other mutations did not show such consistent effects. Previous studies have established that MORC2’s ATP-binding and DNA-binding activities are independent [4]. Our results further suggest that MORC2’s phase separation behavior is also independent of both ATP and DNA binding, although existing evidence hints at potential cross-regulatory interactions among these three functions.

      Strengths:

      The authors determined a 3.1 Å resolution crystal structure of the dimeric coiled-coil 3 (CC3) domain of MORC2, revealing a hydrophobic interface that stabilizes dimer formation. They present extensive evidence that MORC2 undergoes liquid-liquid phase separation (LLPS) across multiple contexts, including in vitro, in cellulo, and in vivo. Through systematic cellular screening, they identified the C-terminal domain of MORC2 as a key driver of condensate formation. Biophysical and biochemical analyses further show that the IDR within the C-terminal domain interacts with the C-terminal end region (IBD) and also exhibits strong DNA-binding capacity, both of which promote MORC2 phase separation. Together, this study emphasizes that interactions mediated by multiple domains (CC3, IDR, and IBD) drive MORC2 phase separation. Finally, the authors quantified the effect of removing the CC3 on the upregulation and downregulation of target gene expression.

      We thank the reviewer for their appreciation of the key findings presented in this manuscript.

      Weaknesses:

      Though the findings appear compelling in isolation, the study lacks discussion on how its findings compare with previous studies. Particularly in the context of MORC2-DNA binding, there are previous studies extensively exploring MORC2-DNA binding (Tan, W., Park, J., Venugopal, H. et al. Nat Commun 2025), and its effect on ATPase activity (ref 22). The contradictory results in ref 22 about the impact of DNA-binding on ATPase activity, and ATPase activity on transcriptional repression, warrant proper discussion. The authors performed extensive in-cellulo screening for the investigation of domain contribution in MORC2 condensate formation, but the study does not consider/discuss the possibility of some indirect contributions from the complex cellular environment. Alternatively, the domain-specific contributions could be quantified in vitro by comparing phase diagrams for their variants. While the basis of this study is to investigate the mechanism of MORC2 condensate-mediated gene silencing, the findings in Figure 6 appear incomplete because the CC3 deletion not only affects phase separation of MORC2 but also dimerization. Furthermore, their investigation on disease-linked MORC2 mutations appears very preliminary and inconclusive because there are no obvious trends from the data. Overall, the discussion appears weak as it is missing references to previous studies and, most importantly, how their findings compare to others'.

      We thank the reviewer for their careful assessment of MORC2’s DNA-binding properties and its relationship with ATPase and transcriptional activities. We would like to offer the following clarifications to address these concerns, which will also be incorporated into the Discussion section of the revised manuscript.

      First, recent work by Tan et al. [5] similarly identified multiple DNA-binding sites in MORC2, consistent with our findings, though there are discrepancies in the precise binding regions. In particular, they reported that isolated CC1 and CC2 domains do not bind 60 bp dsDNA, which contrasts with our observations. We attribute this difference to the types of DNA used in the assays. In our study, we employed 601 DNA, a defined nucleosome-positioning sequence, which differs substantially from randomly designed short dsDNA. For instance, prior work by Christopher H. Douse et al. [54] also confirmed that MORC2’s CC1 domain can bind 601 DNA.

      Second, in the study by Fendler et al. [2], DNA binding was reported to reduce MORC2’s ATPase activity, an observation that appears inconsistent with the results presented in our Fig. 5j. A critical distinction between the two studies lies in the experimental systems used: Fendler et al. [2] employed truncated MORC2 constructs and 35 bp double-stranded DNA (dsDNA), whereas our experiments utilized full-length MORC2 and 601 bp DNA (a sequence with high nucleosome assembly potential). These differences, including the absence of potentially regulatory C-terminal regions in the truncated construct and the varying length and structural properties of the DNA substrates, introduce variables that substantially complicate direct comparison of the ATPase activity outcomes.

      Separately, Douse et al. [4] demonstrated that the efficiency of HUSH complex-dependent epigenetic silencing decreases as MORC2’s ATP hydrolysis rate increases, implying an inverse relationship between ATPase activity and silencing function. Notably, our current work has not established a direct mechanistic link between MORC2 phase separation and its ATPase activity. Thus, we refrain from inferring that the effect of MORC2 phase separation on transcriptional repression is mediated through modulation of its ATPase function; this remains an important question to address in future studies.

      Finally, we have redesigned and expanded the experiments presented in Fig. 6 and Fig. S6 to directly link MORC2’s condensate-forming capacity with its transcriptional regulatory function.

      Reviewer #2 (Recommendations for the authors):

      Major concerns:

      (1) Unaddressed discrepancies with the previous study:

      (a) Inadequate discussion of Reference 22 and apparent contradictions. Notably, Reference 22 provides evidence for reduced ATPase activity upon DNA binding, in contrast to the current study's observations. Moreover, Reference 22 demonstrates that ATP hydrolysis (ATPase activity) is inversely associated with MORC2-mediated gene silencing, whereas this study concludes that 'the silencing function of MORC2 requires its ATPase activity'. These apparent contradictions warrant a more thorough discussion to reconcile the differences, including potential mechanistic explanations and experimental context that could account for the discrepancies. Additionally, the authors should discuss potential reasons why Ref. 22 may not have observed phase separation during MORC2 biophysical analysis. For instance, in Ref. 22, SEC-MALS was performed at 2 mg/mL (~16 µM) MORC2 FL in the presence of 150 mM NaCl, conditions that could influence phase behavior based on the current manuscript's results. Addressing whether differences in protein construct, buffer composition, or experimental design might account for this discrepancy would strengthen the discussion.

      We thank the reviewer for pointing out the apparent discrepancies between our results and those reported in Ref. 22. We agree that these differences warrant explicit discussion, and we have revised the Discussion accordingly to clarify the experimental and conceptual distinctions between the two studies.

      First, regarding the effect of DNA binding on ATPase activity, Ref. 22 examined MORC2 ATPase activity under conditions where MORC2 does not undergo detectable phase separation, whereas our ATPase assays were performed under conditions in which MORC2 readily forms condensates in the presence of DNA. We therefore propose that the observed increase in ATPase activity in our study may reflect a distinct biochemical regime in which phase separation and/or high local protein concentration modulates enzymatic activity. Importantly, our data do not exclude the possibility that DNA binding per se can inhibit ATPase activity under non-condensing conditions, as reported in Ref. 22.

      Second, with respect to transcriptional repression, Ref. 22 reported an inverse correlation between ATP hydrolysis and MORC2-mediated silencing, whereas our study finds that ATPase activity is required for efficient repression. We suggest that these observations are not necessarily contradictory but may reflect different regulatory layers of MORC2 function. Specifically, ATP binding and hydrolysis may be required for MORC2 structural remodeling and chromatin engagement, while excessive or dysregulated ATP hydrolysis could impair stable silencing complexes, as suggested previously [4]. We now explicitly discuss this possibility in the revised manuscript.

      Finally, we appreciate the reviewer’s suggestion regarding the absence of phase separation in Ref. 22. Indeed, SEC-MALS experiments in Ref. 22 were conducted at ~16 µM MORC2 in the presence of 150 mM NaCl (the purification condition is 500 mM NaCl, 10% glycerol), conditions that, based on our phase diagrams, are close to or above the saturation concentration but also strongly influenced by ionic strength. This combination of factors explains why the UV peak from SEC-MALS is not indicative of a homogeneous sample [3].

      (b) The DNA binding capacity of individual MORC2 domains was tested in Fig. 5. IDR appears to be the strongest DNA binder among others. Is this the effect of IDR being isolated from the rest of the protein? A recent paper (Tan, W., Park, J., Venugopal, H. et al. Nat Commun 2025) also investigated DNA binding capacity of different regions of MORC2 using hydrogen-deuterium exchange experiments and EMSA. Interestingly, it can be seen in Figure S9 that the DNA binding capacity of different regions changes when compared together to when in isolation (MORC2 1-603 vs 1-265; 1-495; 496-603). In line with the above, MORC2 IDR's interaction with DNA warrants additional investigation, taking the system as a whole to avoid misinterpretation arising from non-specific interactions.

      We appreciate the reviewer’s insightful comments regarding domain-specific DNA binding and the potential caveats of studying isolated regions. In Figure 5, our EMSA analyses show that the isolated IDR exhibits the strongest DNA-binding signal among the tested fragments. We agree that this observation may, at least in part, reflect the removal of structural or regulatory constraints imposed by the full-length protein.

      Consistent with the reviewer’s point, Tan et al. [5] demonstrated that DNA-binding behavior of MORC2 regions differs when analyzed in isolation versus in the context of larger constructs. We have now incorporated this comparison into the Discussion and explicitly note that DNA binding by the IDR should be interpreted as a contextual and potentially cooperative property rather than an autonomous function.

      Importantly, our conclusions do not rely on the IDR acting as an independent DNA-binding module in vivo. Rather, we propose that the IDR contributes to DNA engagement and phase behavior within the architectural framework of full-length MORC2. We now emphasize this limitation and highlight the need for future studies that probe DNA binding in the context of intact MORC2 or minimally perturbed constructs.

      (2) MORC2 DNA binding impacting phase separation and ATPase activity:

      While it is clear that MORC2-DNA interaction facilitates MORC2 phase separation, the impact on ATPase activity is not conclusive. First, they observe an opposite trend (compared to ref. 22) for DNA binding on MORC2's ATPase activity. Secondly, it is not clear if the increase in ATPase activity is mediated by DNA binding or phase separation. The ATPase activity was measured at 1 µM MORC2 protein concentration in the presence of DNA, where MORC2 appears to phase separate. To draw more definitive conclusions, additional controls are necessary. Specifically, a phase separation-deficient mutant (from this study) and a DNA-binding-deficient mutant (see ref. 22) should be included to disentangle the contributions of DNA binding and phase separation to ATPase activity. The choice of ATP-binding-deficient mutant N39A as a negative control seems inconclusive in this regard. Additionally, why is there an increase in ATP hydrolysis rate for the ATP-binding-deficient mutant in the presence of DNA, resulting in ATP hydrolysis rates similar to WT MORC2? This raises further questions about the underlying mechanism.

      We agree with the reviewer that disentangling the contributions of DNA binding and phase separation to ATPase activity is challenging and that our current data do not fully resolve this issue. As noted, ATPase assays were performed at protein concentrations (1 µM) where MORC2 undergoes DNA-induced phase separation, making it difficult to distinguish whether enhanced ATP hydrolysis arises directly from DNA binding or indirectly from condensate formation.

      We acknowledge that inclusion of additional mutants, such as phase-separation-deficient or DNA-binding-deficient variants, would provide a more definitive mechanistic separation of these effects. However, generating and validating such mutants in a manner that preserves overall protein integrity is beyond the scope of the current study. Accordingly, we have revised the text to present our findings more cautiously and to frame the observed ATPase enhancement as a correlation rather than a causal mechanism.

      Regarding the ATP-binding–deficient N39A mutant, we agree that its behavior in the presence of DNA raises interesting mechanistic questions. We now explicitly note this unexpected observation and discuss possible explanations, including partial ATP binding, altered oligomeric states, or indirect effects mediated by condensate formation.

      (3) Dissecting the domain-specific contribution in MORC2 phase separation:

      (a) While in cellulo data indicate that IDR, NLS, CC3, and IBD are all essential for MORC2 condensate formation, it is not clear if this is the effect of the complex cellular environment or whether it is intrinsic to MORC2's phase separation ability. In lines 256-259, the authors suggest IDRa interaction with IBD may serve as a nucleation mechanism for LLPS. In other places, it has been mentioned that CC3 dimerization acts as a scaffold for condensate formation. It is not clear if all of these are essential for MORC2 phase separation, or one of them is essential while the other domain(s) facilitate the phase separation. Though Figure 3 provides a qualitative overview of the contribution of different regions to MORC2 phase separation in cellulo, influenced by the complex cellular environment and substrate interactions, the absolute domain contribution to phase separation would be better studied in vitro by quantitatively comparing phase diagrams (for example, c-sat vs temperature) of different domain deletion constructs.

      We thank the reviewer for highlighting the distinction between intrinsic phase separation propensity and cellular-context-dependent effects. Our in cellulo screening was designed to identify regions required for condensate formation under physiological conditions, where chromatin, binding partners, and macromolecular crowding are present. We agree that this approach does not directly quantify the intrinsic phase separation contribution of individual domains.

      While CC3 dimerization, IDR–IBD interactions, and nuclear localization all contribute to condensate formation, our data do not imply that these elements are mechanistically equivalent. Rather, we propose that CC3 provides a structural scaffold, while IDR-mediated interactions lower the energetic barrier for condensation. We have revised the manuscript to clarify this hierarchical model and to avoid implying that all domains contribute equally or independently.

      We agree that quantitative in vitro phase diagrams would provide valuable insight into intrinsic domain contributions. Whereas the MORC2ΔCC3-IBD (1–900) and CC3-IBD (900–1032) fragments each fail to induce phase separation on their own, a mixture of the IDR and CC3–IBD fragments drives robust phase separation; additionally, phase separation is entirely abrogated in the absence of domain–domain interactions. These observations collectively verify that phase separation is contingent on specific domain combinations and their interactions.

      (b) Similarly, for line 228-231: 'Notably, condensates formed exclusively in the nucleus and not in the cytoplasm of transfected HeLa cells, suggesting that chromatin-associated nuclear factors, such as DNA, may contribute to the nucleation or stabilization of MORC2 condensates.' This is an important observation made by the authors. Since MORC2 readily phase separates in vitro under physiological conditions, it is important to discuss why MORC2 does not make condensates in the cytoplasm (in the case of MORC2deltaNLS). In this regard, how does the concentration of overexpressed EGFP-MORC2 constructs compare with in vitro tested droplets of MORC2?

      We thank the reviewer for highlighting this important conceptual point. Although MORC2 readily undergoes phase separation in vitro under physiological buffer conditions, the absence of condensate formation in the cytoplasm of cells expressing MORC2ΔNLS underscores the importance of the nuclear environment in promoting MORC2 assembly.

      The cytoplasm differs fundamentally from the nucleus not only in overall molecular composition but also in the availability of high-valency scaffolds such as chromatin. We propose that chromatin-associated components, particularly DNA, provide a platform that locally concentrates MORC2 and increases its effective valency, thereby facilitating nucleation or stabilization of condensates in the nucleus. In contrast, the cytoplasm lacks such scaffolds, even when MORC2 is expressed at appreciable levels. In cultured cells, MORC2 is seldom observed in the cytoplasm. While specific experimental contexts may facilitate its cytoplasmic localization, such observations are rarely reported [6]. In transfection-based systems, MORC2 predominantly displays droplet-like behavior in the nucleus. Notably, in endogenous EGFP–MORC2 chimeric mice, we detected punctate MORC2 structures in the neuronal cytoplasm of the brain and spinal cord. The functional significance and biophysical state of cytoplasmic MORC2 remain largely unexplored.

      With respect to protein concentration, while EGFP-MORC2 is robustly expressed in cells, direct comparison between cellular expression levels and the protein concentrations used in vitro is inherently challenging. Importantly, in vitro phase separation is driven by bulk protein concentration under defined conditions, whereas in cells, effective local concentration and interaction valency are strongly shaped by spatial confinement and chromatin association. We have revised the manuscript text to emphasize this distinction and to avoid interpreting nuclear specificity as a purely concentration-dependent phenomenon.

      (c) Lines 227-228: '... CW domain restricts condensate overgrowth or fusion', this inference is based on CTDdeltaCW puncta being larger in size (Figure 3a). However, in Figure 4h MORC2deltaIDRb and MORC2deltaIDRc also result in larger puncta. Making a final conclusion that the CW domain restricts condensate overgrowth or fusion warrants additional investigation.

      We thank the reviewer for pointing out the limitation of our original conclusion. We agree that inferring from the enlarged CTDΔCW puncta (Figure 3a) that the CW domain regulates condensate size was insufficiently rigorous.

      Re-analysis of existing data identifies clear phenotypic disparities between the mutants: MORC2ΔIDRb/ΔIDRc mutants show two distinct phenotypes (reduced puncta number with enlarged size, or unchanged puncta number with uniform enlargement), and their total puncta area per cell is comparable to the WT. By contrast, CTDΔCW mutants display markedly larger puncta relative to the WT. Based on this distinction, we have revised our conclusion to a more cautious formulation: "These observations suggest that the CW domain may participate in regulating initial nucleation size and the exact molecular mechanisms require further investigation."

      (4) MORC2 condensate-mediated gene silencing:

      This is one of the key investigations of this study where the authors evaluate the ability of MORC2 condensates to regulate gene silencing (transcriptional repression). The major concern here is that the authors are drawing their conclusion based on a CC3 domain deletion mutant of MORC2 and comparing it with wild-type MORC2. Notably, the CC3 domain is responsible for MORC2 dimerization, and as the authors quote, 'The dimeric assembly of CC3 is essential for maintaining the structural integrity of the protein', the absence of CC3 would have a direct impact on its function (such as ATPase activity). With these considerations, it is not clear whether the effect of CC3 domain deletion on gene regulation is an effect of no phase separation or a consequence of loss of function. This necessitates additional validation by including other controls, such as an IBD deletion mutant or an IDRa deletion mutant, in which phase separation is impeded without affecting dimerization.

      We appreciate the reviewer’s concern regarding the interpretation of CC3 deletion experiments. We agree that CC3 deletion affects both dimerization and phase separation, complicating attribution of gene regulatory effects solely to condensate formation. Our intention was not to claim that loss of repression arises exclusively from impaired phase separation, but rather to demonstrate that disrupting condensate-dynamic capacity correlates with impaired silencing.

      To directly address these concerns, we have performed a series of new experiments specifically designed to decouple condensate formation, condensate dynamics, and protein abundance, thereby allowing us to more rigorously interrogate the functional relevance of MORC2 condensates.

      First, to overcome the limitation of domain deletions, which may affect MORC2 function beyond phase separation, we introduced a micropeptide-based kill switch (KS) at the C terminus of MORC2. This strategy has recently emerged as a powerful approach to selectively reduce condensate dynamics without disrupting protein expression, folding, or domain architecture [1]. Importantly, unlike CC3 or IDRa deletions, MORC2+KS robustly forms nuclear condensates but exhibits markedly reduced internal dynamics, as demonstrated by FRAP analyses showing minimal fluorescence recovery after photobleaching (Fig. 6a-c). This strategy therefore allows us to perturb condensate material properties independently of MORC2 domain integrity.

      Second, we systematically compared the transcriptional consequences of rescuing MORC2-knockout HeLa cells with MORC2FL, condensation-deficient mutants (ΔCC3 and ΔIDRa), and the dynamics-defective MORC2+KS (Fig. 6d). Despite being expressed at substantially higher levels than MORC2FL (Fig. 6e), all three mutants showed a striking and consistent failure to restore MORC2-dependent transcriptional regulation (Fig. 6f-h). This effect was particularly pronounced for transcriptionally repressed genes, including two sets of high-confidence MORC2 targets reported in prior studies (Fig. 6i and Fig. S10). These findings demonstrate that neither increased protein abundance nor the mere presence of condensate-like structures alone is sufficient to restore MORC2 function.

      Third, our data instead support a model in which both soluble MORC2 complexes and dynamic MORC2 condensates are required for full transcriptional activity. While soluble MORC2 is likely involved in target recognition and complex assembly, our results indicate that proper condensate formation and, critically, condensate dynamics are essential for effective transcriptional repression and activation. The inability of the MORC2+KS mutant to rescue transcriptional defects, despite intact condensate formation, points away from a model in which MORC2 condensates represent only microscopically visible byproducts of MORC2 activity.

      We believe these new data strengthen the manuscript by pairing the detailed mechanistic dissection of MORC2 phase separation with direct functional evidence, enhancing the conceptual impact and biological significance of the study.

      (5) Uncertain impact of pathogenic MORC2 mutations:

      Line 356-365: While the statements such as "disease-associated mutations primarily affect enzymatic and phase behaviors rather than DNA affinity" and "these findings provide mechanistic insight into how specific mutations may contribute to distinct pathological outcomes" are conceptually compelling, the data presented in Figure 7b-d do not appear to fully support these conclusions. For many of the mutants, the differences from WT across key parameters (condensation, ATPase activity, and DNA binding) are either modest or statistically insignificant. As such, drawing a unified mechanistic conclusion from these datasets may overstate what the data actually support.

      We agree that the effects of disease-associated MORC2 mutations described in Fig. 7 are modest and, in some cases, statistically insignificant. Our intention was to document observable trends rather than to propose a unified mechanistic framework. We have revised the manuscript to temper these conclusions and to emphasize the descriptive nature of these data.

      (6) Important conceptual clarifications:

      (a) Intrinsically disordered regions (IDRs) are not synonymous with phase separation. As the authors show, it is a combination of IDR-mediated interactions and CC3 dimerization that contributes towards the phase separation of MORC2. While IDRs can act as scaffolds for multivalent weak interactions that may promote biomolecular condensate formation, many IDRs serve other roles (such as mediating transient interactions, signaling, or regulatory functions) without undergoing phase separation. Researchers should avoid generalizing the assumption that the mere presence of IDRs in a protein implies its ability for phase separation. In this regard, authors should consider restructuring some of their generalized statements: Line 87-88: 'Recent studies suggest that intrinsically disordered regions (IDRs) can drive liquid-liquid phase separation (LLPS)' and Line 159-161: 'we noticed a long unstructured region at its C-terminus (Fig. S1b), a characteristic often associated with proteins capable of phase separation'.

      We agree that IDRs are not synonymous with phase separation and have revised the Introduction to avoid generalized statements. The revised text now emphasizes that IDRs can contribute to phase separation in a context-dependent manner and act in concert with structured oligomerization domains such as CC3-IBD.

      (b) Liquid-liquid phase separation: I would suggest switching the phrase to just phase separation. The rationale is that the in vitro studies of MORC2 (FRAP, droplet imaging) do not show liquid-like behavior, but perhaps liquid-solid. The FRAP studies suggest liquid-like behavior for some of the constructs. Given the differences in viscoelastic properties across the in vitro and in cellulo studies, it is better to generalize to "phase separation". Movies for droplet fusion and FRAP, wherever applicable, would be much appreciated. As the nature of in vitro MORC2 droplets appears different than in cells, movie representations of the above would enable readers to better assess the viscoelastic nature of the droplets (whether liquid, gel, etc).

      We appreciate the reviewer’s insight regarding the viscoelastic properties of MORC2. Our experimental data indeed show a disparity in dynamics between the two environments: while in vitro MORC2-FL condensates exhibit relatively low internal mobility, the in cellulo MORC2-FL puncta display high dynamics, characterized by rapid internal recovery in FRAP assays and droplet fusion events (Fig. S2f).

      This contrast suggests that the intracellular microenvironment plays a critical role in regulating the material state of MORC2 condensates. Consequently, we have focused on providing in vivo fusion data, as we believe in vitro characterizations (such as fusion or FRAP under various artificial conditions) may not faithfully represent the physiological behavior of MORC2. We have revised the manuscript to use the more general term “phase separation” or “condensation” and have added a discussion on these limitations to avoid overinterpreting the material properties observed in vitro.

      (7) Methods:

      (a) Figure 6 S2b: If phase separation occurs at, say, 1.8 µM protein concentration, this indicates that the protein has reached its saturation concentration (c-sat). Beyond c-sat, any additional protein should partition into the dense phase, while the concentration of the dilute phase remains constant. However, in this figure, the dilute phase concentration appears to increase with increasing total protein concentration, which is inconsistent with expected phase separation behavior. As the methods section does not have any sub-section for the sedimentation assay, it becomes difficult to understand how this experiment was performed, whether there is any technical discrepancy in the way soluble and pellet fractions were handled and processed for loading onto the gels. This is also the case with Figure 3d.

      We thank the reviewer for carefully examining the sedimentation assay and for raising this important conceptual point. We agree that, for an ideal two-phase system at thermodynamic equilibrium, the concentration of the dilute phase is expected to remain constant once the saturation concentration (c-sat) is reached.
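
      The ideal behavior described here can be illustrated with a minimal sketch of the lever rule for a two-phase system at equilibrium. The function name, the dense-phase concentration, and the example totals are hypothetical and chosen only for illustration; the 1.8 µM saturation value is the reviewer's "say, 1.8 µM" example, not a measured c-sat.

```python
def ideal_partition(c_total_um, c_sat_um, c_dense_um):
    """Ideal two-phase partitioning at equilibrium (illustrative only).

    Returns (dilute-phase concentration in µM, dense-phase volume fraction).
    Assumes c_sat_um < c_dense_um and c_total_um <= c_dense_um.
    """
    if c_total_um <= c_sat_um:
        # Below saturation: a single phase, nothing partitions.
        return c_total_um, 0.0
    # Lever rule: c_total = (1 - phi) * c_sat + phi * c_dense
    phi = (c_total_um - c_sat_um) / (c_dense_um - c_sat_um)
    # Above saturation the dilute phase stays pinned at c_sat.
    return c_sat_um, phi


if __name__ == "__main__":
    C_SAT = 1.8       # µM, reviewer's example value
    C_DENSE = 1000.0  # µM, hypothetical dense-phase concentration
    for total in (0.5, 1.0, 2.0, 5.0, 10.0):
        dilute, phi = ideal_partition(total, C_SAT, C_DENSE)
        print(f"total={total:5.1f} µM  dilute={dilute:4.2f} µM  phi={phi:.5f}")
```

      In this idealization only the dense-phase volume fraction grows with total protein, so a supernatant signal that keeps rising above c-sat points to non-ideal or non-equilibrium effects in the assay rather than to this model.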

      In our study, the sedimentation assay was used as an operational readout to assess concentration-dependent partitioning rather than to quantitatively define equilibrium phase boundaries. The assay involves centrifugation-based separation of supernatant and pellet fractions followed by SDS–PAGE analysis, and therefore does not necessarily report the equilibrium concentrations of coexisting dilute and dense phases. In particular, this approach can be influenced by incomplete physical separation of phases, kinetic trapping, and redistribution of material during handling, especially in systems where condensate maturation or internal reorganization occurs on longer timescales.

      Consequently, the apparent increase in the supernatant fraction with increasing total protein concentration likely stems from kinetic limitations and inherent technical constraints of the sedimentation assay, rather than a genuine deviation from classical phase separation behavior. These caveats are now explicitly clarified in the Methods section, with similar limitations of centrifugation-based assays for defining equilibrium phase behavior of biomolecular condensates reported previously.

      (b) Figure 4: The NMR comparisons appear to be primarily qualitative, lacking quantitative analyses such as chemical shift perturbation (CSP) and intensity ratio plots, which would offer deeper mechanistic insights. The NMR spectra detailing interactions among the IDR domains need to be quantified.

      We thank the reviewer for the suggestion. We have now performed quantitative CSP analyses for the NMR data shown in Fig. 4, and the corresponding CSP plots have been added to the revised manuscript (Fig. S7).

      As expected for interactions mediated by intrinsically disordered regions involved in phase separation, the observed CSPs are generally small. Notably, the CSP profile of IDRa closely matches that observed for the full-length IDR, whereas IDRb and IDRc show minimal perturbations. These results indicate that the interaction is primarily mediated by IDRa, with little contribution from the remaining regions.
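      For readers less familiar with CSP analysis, the following is a minimal sketch of one commonly used way to combine 1H and 15N shift differences into a single per-residue value in Hz. It is an assumption for illustration and not necessarily the equation used in the manuscript: each ppm difference is converted to Hz using the nucleus's Larmor frequency at a 600 MHz spectrometer and the two are combined as a Euclidean distance. The function and constant names are hypothetical.

```python
import math

H1_MHZ = 600.0                # 1H Larmor frequency of the spectrometer
GAMMA_RATIO_N15_H1 = 0.10136  # approximate |gamma(15N) / gamma(1H)|
N15_MHZ = H1_MHZ * GAMMA_RATIO_N15_H1  # about 60.8 MHz

def csp_hz(delta_h_ppm, delta_n_ppm):
    """Combined 1H/15N chemical shift perturbation in Hz (assumed formula).

    A shift difference in ppm multiplied by the Larmor frequency in MHz
    gives the difference in Hz.
    """
    d_h_hz = delta_h_ppm * H1_MHZ
    d_n_hz = delta_n_ppm * N15_MHZ
    return math.hypot(d_h_hz, d_n_hz)
```

      Expressing both dimensions in Hz avoids choosing an empirical 15N scaling factor, which is one reason this form is sometimes preferred for small perturbations such as those expected for disordered regions.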

      Peak intensities were also examined but did not reveal additional residue-specific trends. Together, the quantitative CSP data support our conclusion that the interaction is weak, dynamic, and region-specific, consistent with an IDR-driven, phase-separation-related mechanism. We have added the following statement to the Methods: CSPs were calculated in Hz at 600 MHz using the following equation:

      Minor comments:

      (1) Line 59-60: The Authors mention the HUSH-complex and then the MORC protein family, but do not discuss the relation between the two.

      We thank the reviewer for this comment. We have revised the Introduction to explicitly state that MORC2 may serve as a component of the HUSH complex and to clarify the functional relationship between MORC family proteins and HUSH-mediated transcriptional repression.

      (2) Line 74: 'Despite their structural similarities...', similarities between what all?

      We agree that this statement was ambiguous. We have revised the text to explicitly specify that the comparison refers to structural similarities among MORC family members.

      (3) Line 75: 'MORC-mediated repression remains...', this is the first time the word 'repression' is mentioned in the text and directly as an outstanding question.

      We have revised the Introduction to introduce the concept of transcriptional repression earlier and to provide appropriate context before posing it as an outstanding question.

      (4) The third paragraph does address issues in comments 1 and 3 to some extent, but the introduction needs some restructuring to provide a proper flow of information.

      We agree that the Introduction required restructuring. We have revised this section to improve logical flow, better integrate prior studies, and more clearly articulate the motivation and scope of the present work.

      (5) Line 83-85: How does the presence of IDRs suggest potential regulatory mechanisms?

      We have revised this sentence to clarify that IDRs may contribute to regulatory mechanisms by enabling multivalent and dynamic interactions, rather than implying that IDRs inherently confer regulatory function or phase separation capability.

      (6) Line 106-107: 'To determine whether MORC2 has N- and C-terminal dimerization interfaces similar to those...', reference 14 has already established that CC3 (denoted as CC4 in ref 14) is responsible for dimerization. Consider acknowledging their work in this regard?

      We thank the reviewer for this reminder. We have now explicitly acknowledged Ref. 14, which previously established the role of CC3 (denoted CC4 in that study) in MORC2 dimerization.

      (7) Lines 117-122: Are the authors comparing morphology from negative stain EM with AlphaFold predicted structure (Figure S1a and S1b)? If so, providing a zoomed-in inset from Figure S1a would be helpful.

      Yes, the comparison was intended to relate the negative-stain EM morphology to the AlphaFold-predicted architecture. We have added a zoomed-in inset in Fig. S1a to facilitate clearer comparison.

      (8) Line 152-153: '...even under varying physiological conditions', what are these varying conditions? Are the authors trying to point towards any of their specific results?

      We have revised this phrase to explicitly refer to variations in salt concentration and protein concentration tested in our in vitro assays.

      (9) Line 154-155: 'The dimeric assembly of CC3 is essential for maintaining the structural integrity of the protein', if it has been established, then please provide a reference.

      We thank the reviewer for this suggestion. For MORC family proteins, C-terminal coiled-coil–mediated dimerization is necessary for correct homodimer formation and functional stability (Xie et al., 2019, Cell Commun Signal. 17:160, Ref 14 in the revised manuscript).

      (10) Line 159-161: 'we noticed a long unstructured region at its C-terminus (Figure S1b), a characteristic often associated with proteins capable of phase separation25.', again authors are generalizing a statement which is, in most cases, context-dependent. For example, ref 25 mentions that unstructured regions or IDRs serve as a scaffold for multivalent interactions.

      We agree with the reviewer and have revised this sentence to avoid generalization. The revised text now emphasizes that IDRs may facilitate multivalent interactions in a context-dependent manner, rather than being intrinsically indicative of phase separation. Additionally, we have explicitly cited the mechanistic insight from Reference 25 that IDRs serve as scaffolds for multivalent interactions, to strengthen the logical link between the structural feature and its potential functional relevance.

      (11) Methods section for NMR (Line 665-667) mentions that nucleotides were added to a final concentration of 10 mM. There is no figure or section for MORC2 NMR with added nucleotides/DNA.

      We thank the reviewer for pointing this out. The nucleotide (ATP) addition was part of preliminary NMR trials and is not directly associated with the figures presented. We have deleted this in the Methods section to avoid confusion.

      (12) Line 285-294: Authors compare the effect of DNA binding on the phase separation of both MORC2FL and MORC2 CTDdeltaCW and conclude that DNA-induced condensation is primarily mediated through interactions with the IDR-NLS region. This appears not to be backed by proper control experiments. The authors do not show whether DNA binding mediates any phase separation for the isolated NTD or not? Similarly, what is the effect of DNA binding on MORC2 deltaIDR?

      We thank the reviewer for this insightful comment and agree that additional controls are essential for rigorously dissecting the contribution of DNA binding to MORC2 phase separation. Our interpretation that DNA-enhanced condensation is primarily mediated through the IDR–NLS region was based on comparative analyses of MORC2FL and MORC2 CTDΔCW, together with EMSA results demonstrating that DNA binding activity is conferred by the IDR–NLS–containing region. We acknowledge, however, that DNA binding alone is not sufficient to infer phase separation behavior.

      To address this point, we have performed additional analyses using the isolated NTD’ (residues 1–536) and MORC2 ΔIDR–NLS mutants (Fig. S6). The isolated NTD’ exhibited detectable DNA binding [4] but did not undergo DNA-induced condensation under conditions in which MORC2FL or MORC2 CTDΔCW (residues 537–1032) readily formed condensates, indicating that DNA binding by itself is insufficient to drive phase separation. In parallel, MORC2 ΔIDR–NLS mutants showed severely compromised solubility and stability in vitro, which limited their quantitative characterization in phase separation assays. Nevertheless, under the conditions tested, these mutants did not display DNA-enhanced condensation comparable to that of MORC2FL.

      Taken together, these observations support a model in which the IDR–NLS region plays a critical role in coupling DNA binding to condensation, while additional domains are required to sustain robust phase separation. We have revised the manuscript text to clarify the experimental scope and to avoid overinterpreting the contribution of DNA binding in the absence of fully reconstituted control systems.

      (13) How did the authors assign the backbone amide NMR chemical shifts for MORC2?

      Backbone assignments of MORC2 IBD (1004–1032) were obtained using SOFAST versions of standard triple-resonance experiments, including HNCACB and CBCACONH, recorded at 298 K. Residual assignment ambiguities were resolved using 15N-edited HMQC-NOESY-HMQC spectra.

      (14) Line 256: 'The partial compaction of IDRa...', what does the author mean here with 'partial compaction'? How did they measure compaction here?

      Regarding the term “partial compaction”, we apologize for the typographical error: this phrase was erroneously used in place of “key component”.

      (15) Line 312-315: Why is there even a MORC2 readout for MORC2 KO cells with only EGFP? Also, the authors suggest that IDR deletion may impair mRNA stability or transcription; however, the expression levels of MORC2 deltaIDR and MORC2 deltaCC3 do not appear drastically different in Figure 3a.

      We thank the reviewer for raising these points. The apparent MORC2 signal in MORC2 knockout cells transfected with EGFP alone is due to the presence of residual MORC2 mRNA. Although CRISPR–Cas9–mediated knockout introduces a frameshift that prevents MORC2 protein expression, the mRNA can still be detected by RNA-seq. This is because nonsense-mediated decay (NMD), which targets transcripts with premature stop codons for degradation, is not always 100% efficient. Therefore, some MORC2 transcripts remain and produce detectable RNA-seq reads, even though no functional protein is expressed.

      Regarding the apparent discrepancy in expression levels, Fig. 3a displays only EGFP-positive cells, within which the fluorescence intensity of MORC2ΔIDR and MORC2ΔCC3 appears comparable to that of WT MORC2. However, the overall fraction of EGFP-positive cells is markedly reduced for these mutants compared to WT. Thus, while expression levels among successfully transfected cells are similar, fewer cells express detectable levels of the ΔIDR or ΔCC3 constructs across the total population. We therefore interpret this reduction in EGFP-positive cell fraction as reflecting impaired expression efficiency of these mutants, potentially arising from altered transcriptional output, mRNA stability, or protein stability. We have revised the manuscript text to clarify this distinction and to avoid overinterpreting the underlying mechanism in the absence of direct measurements.

      Author response image 1.

      EGFP, EGFP–MORC2 (FL), EGFP–MORC2 (ΔCC3), and EGFP–MORC2 (ΔIDR) were re-expressed in MORC2-knockout HeLa cells. Confocal imaging revealed that full-length MORC2 formed condensates in the nucleus, whereas mutants lacking either the CC3 or IDR domain failed to exhibit such behavior. Notably, under identical experimental conditions, we observed a marked reduction in the transfection efficiency of the EGFP-MORC2 (ΔIDR) construct. In contrast to the other variants, EGFP signals for ΔIDR were detectable in only a small fraction of the total cell population, despite consistent DNA loading and protocol synchronization. This observation suggests that the IDR might be required not only for biomolecular condensation but also for maintaining the steady-state levels of the MORC2 mRNA/protein or overall cellular fitness.

      (16) Line 330: 'MORC2 deltaCC3 failed to repress any of the 18 downregulated targets...'. This does not appear to be entirely true, as repression of some targets (LBH, TGFB2, GADD45A) is closer to MORC2 FL than to the EGFP control.

      We thank the reviewer for pointing out this inconsistency and for highlighting the need for precise wording. We have updated the dataset and revised the text to describe the results more accurately. We now describe that the mutants impair MORC2FL-mediated transcriptional regulation, consistent with the overall trend observed across these target genes.

      (17) Line 347-350: Based on the percent of cells with condensates, the authors conclude that CMT2Z-linked E236G and SMA-linked T424R mutants promote MORC2 phase separation. Again, the effect of these mutations on MORC2 condensation in cells may be direct or indirect. This can be investigated by comparing the in vitro effect of these mutations on MORC2 phase separation.

      We thank the reviewer for raising this important point and fully agree that the effects of disease-associated MORC2 mutations on condensate formation in cells may arise from either direct alteration in intrinsic phase separation propensity or indirect influences mediated by the cellular environment.

      In our study, disease-associated MORC2 mutants were assessed for condensate formation in HEK293F cells. Attempts were made to characterize these mutants in vitro; however, the E236G mutant exhibited markedly reduced solubility and stability upon purification, which precluded reliable in vitro phase separation analysis. We therefore evaluated the impact of E236G in cells and found that this mutation significantly impaired the dynamics of nuclear MORC2 condensates. For the T424R mutant, we note that its intracellular condensates displayed FRAP recovery kinetics comparable to those of WT MORC2, suggesting broadly similar dynamic properties of the assemblies formed in cells, but not necessarily implying a direct enhancement of intrinsic phase separation.

      In light of these considerations, we have revised the text in Lines 347–350 to avoid attributing a direct causal role of these mutations in promoting MORC2 phase separation. Instead, we now describe the observed increase in the fraction of cells containing condensates as a descriptive cellular correlation. We further emphasize that systematic in vitro characterization of disease-associated MORC2 mutants will be required to distinguish direct from indirect effects and represents an important direction for future investigation.

      (18) The discussion section lacks referencing to individual figures in the results section as well as previous literature.

      We agree with the reviewer that the Discussion would benefit from clearer integration with both the Results figures and prior literature. In the revised manuscript, we have substantially restructured the Discussion to explicitly reference key figures when interpreting experimental findings and to more clearly distinguish conclusions drawn from specific datasets. In addition, we have expanded citations to previous studies where relevant, particularly in the context of MORC2 DNA binding, ATPase regulation, chromatin association, and disease-linked mutations. These revisions aim to better situate our findings within the existing literature and to guide readers more clearly between experimental observations and their interpretation.

      Reviewer #3 (Public review):

      Summary:

      The manuscript by Zhang et al. demonstrates that MORC2 undergoes liquid-liquid phase separation (LLPS) to form nuclear condensates critical for transcriptional repression. Using a combination of in vitro LLPS assays, cellular studies, NMR spectroscopy, and crystallography, the authors show that a dimeric scaffold formed by CC3 drives phase separation, while multivalent interactions between an intrinsically disordered region (IDR) and a newly defined IDR-binding domain (IBD) further promote condensate formation. Notably, LLPS enhances MORC2 ATPase activity in a DNA-dependent manner and contributes to transcriptional regulation, establishing a functional link between phase separation, DNA binding, and transcriptional control. Overall, the manuscript is well-organized and logically structured, offering mechanistic insights into MORC2 function, and most conclusions are supported by the presented data. Nevertheless, some of the claims are not sufficiently supported by the current data and would benefit from additional evidence to strengthen the conclusions.

      Thank you for your insightful review and constructive suggestions, which have been invaluable in refining our manuscript.

      The following suggestions may help strengthen the manuscript:

      Major comments:

      (1) The central model proposes that multivalent interactions between the IDR and IBD promote MORC2 LLPS. However, the characterization of these interactions is currently limited. It is recommended that the authors perform more systematic analyses to investigate the contribution of these interactions to LLPS, for example, by in vitro assays assessing how the IDR or IBD individually influence MORC2 phase separation.

      We appreciate the reviewer’s insightful comment regarding the characterization of IDR–IBD interactions. In this study, we combined NMR spectroscopy, domain deletion analysis (in vivo), and in vitro phase separation assays to demonstrate that interactions between the IDR and IBD contribute to MORC2 condensate formation. To systematically assess the individual contributions of the IDR and IBD to MORC2 phase separation, we performed in vitro reconstitution assays using purified domain constructs (Fig. S6). Neither the isolated IDR nor the isolated IBD exhibited phase separation under buffer conditions approximating the physiological environment, indicating that each domain is individually insufficient to drive condensation. Upon the addition of 10% PEG8000, phase separation was selectively observed for the IDR but not for the IBD, suggesting that the IDR possesses an intrinsic propensity for phase separation that can be enhanced by molecular crowding. Importantly, when the IDR and IBD were mixed, phase separation was robustly induced, supporting a model in which cooperative inter-domain interactions between the IDR and IBD promote MORC2 condensation. In the absence of PEG, no phase separation was observed for the IDR–IBD mixture. These observations imply that IDR–IBD interactions cannot drive phase separation on their own but require cooperation with CC3-mediated dimerization, which is the central point we wish to emphasize.

      (2) The authors mention that DNA binding can promote MORC2 LLPS. It is recommended that they generate a phase diagram to systematically assess how DNA influences phase separation.

      We agree that constructing a full phase diagram would provide a more systematic evaluation of the effect of DNA on MORC2 phase separation. In the current study, we assessed DNA-dependent condensation across multiple protein and DNA concentrations, which consistently showed that DNA enhances MORC2 phase separation. At low protein concentration (0.5 µM), phase separation requires sufficient DNA, whereas increasing either DNA or protein concentration promotes liquid droplet formation. At high DNA and protein concentrations, amorphous structures dominate, indicating a transition away from dynamic assemblies. We have clarified this point in the Results and Discussion sections and now note that a comprehensive phase diagram analysis represents an important direction for future work.

      (3) The authors use the N39A mutant as a negative control to study the effect of DNA binding on ATP hydrolysis. Given that N39A is defective in DNA binding, it could also be employed to directly test whether DNA binding influences MORC2 phase separation.

      We thank the reviewer for these constructive suggestions. The purified wild-type MORC2(1–603) exhibited weak but detectable ATPase activity, whereas the N39A mutant was completely inactive [5]. On this basis, the N39A mutant was used as an ATP-binding-deficient negative control in this study [3]. However, no evidence has been provided to demonstrate that the N39A mutant is defective in DNA binding. Importantly, both our results and previous studies [5-6] indicate that MORC2 engages DNA via multiple domains, suggesting that a single point mutation is unlikely to significantly compromise its overall DNA-binding capacity.

      (4) Many of the cellular and in vitro LLPS experiments employ EGFP fusions. The authors should evaluate whether the EGFP tag influences MORC2 phase separation behavior.

      We appreciate the reviewer’s concern regarding the potential influence of the EGFP tag. The use of EGFP fusions in our study was primarily to maintain consistency with the in-cell experiments. Importantly, we confirmed that EGFP alone does not undergo phase separation in cells, and this observation is consistent with previous studies [7]. Additionally, in vitro phase separation of MORC2 was independently validated using Cy3–labeled CTD (Fig. S5), which recapitulated the condensate formation seen with EGFP-fused protein. Together, these results indicate that the EGFP tag does not significantly influence MORC2 phase separation, supporting the validity of our conclusions.

      Reviewer #3 (Recommendations for the authors):

      (1) The authors claim to have obtained nucleic acid-free protein, but no data are provided to support this assertion. It is recommended that they include appropriate validation to confirm the absence of nucleic acids.

      We thank the reviewer for highlighting this point. To validate that the purified MORC2 protein is indeed free of nucleic acid contamination, we have obtained additional experimental evidence (e.g., A260/280 measurements, agarose gel analysis, or EMSA in Fig. 5), which has been added to the Methods section and Table S2.

      Note: Agarose gel analysis of MORC2 constructs to confirm the absence of nucleic acids. The pET32 vector served as the positive control; 0.05 mg of each protein preparation was analyzed. E denotes E. coli and H denotes HEK293F.

      (2) The FRAP recovery curves are not normalized to 0, making comparison difficult. The authors should normalize the post-bleach intensity to 0 and re-plot the curves to allow a more standard interpretation of mobile fractions.

      We agree with the reviewer and have now normalized the FRAP recovery curves by setting the post-bleach intensity to 0. The revised plots are presented in Figures 2f, 2j, 2l, 6c, and 7f, allowing more direct comparison of mobile fractions across conditions.
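      The normalization requested by the reviewer can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact processing applied to the figures: the function name and input layout are hypothetical, the pre-bleach level is taken as the mean of all frames before the bleach, and no correction for background or acquisition photobleaching is included.

```python
def normalize_frap(intensities, bleach_index):
    """Rescale a FRAP trace so that pre-bleach = 1 and first post-bleach = 0.

    intensities  : fluorescence values of the bleached region, in time order
    bleach_index : index of the first frame acquired after bleaching
    """
    if bleach_index < 1 or bleach_index >= len(intensities):
        raise ValueError("bleach_index must leave frames on both sides")
    pre_frames = intensities[:bleach_index]
    pre = sum(pre_frames) / len(pre_frames)  # mean pre-bleach level
    post = intensities[bleach_index]         # first post-bleach value
    span = pre - post
    if span <= 0:
        raise ValueError("post-bleach intensity must be below pre-bleach")
    return [(value - post) / span for value in intensities]
```

      With this scaling, the plateau of the normalized curve can be read directly as an estimate of the mobile fraction, which is what makes curves from different constructs or conditions comparable.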

      (3) The HSQC spectra for IBD appear inconsistent: the peak positions in Fig. 4C do not align with those shown in panels D-F. The authors should verify the spectral assignments and ensure consistency across figures.

      We thank the reviewer for pointing this out. The apparent inconsistency arose from the fact that different spectral regions were displayed in Fig. 4c versus Fig. 4d-f for visualization purposes, which may have given the impression of mismatched peak positions. The spectral assignments themselves are consistent across all panels.

      To avoid confusion, we have now adjusted the spectral window shown in Fig. 4c to match that used in Fig. 4d-f. The revised figure ensures consistent presentation of the same spectral region across all panels.

      References:

      (1) Zhang, Y., Stöppelkamp, I., Fernandez-Pernas, P. et al. Probing condensate microenvironments with a micropeptide killswitch. Nature 643, 1107–1116 (2025).

      (2) Fendler NL, Ly J, Welp L, et al. Identification and characterization of a human MORC2 DNA binding region that is required for gene silencing. Nucleic Acids Res. 53(4):gkae1273 (2025).

      (3) Tchasovnikarova, I., Timms, R., Douse, C. et al. Hyperactivation of HUSH complex function by Charcot–Marie–Tooth disease mutation in MORC2. Nat Genet 49, 1035–1044 (2017).

      (4) Douse, C. H. et al. Neuropathic MORC2 mutations perturb GHKL ATPase dimerization dynamics and epigenetic silencing by multiple structural mechanisms. Nat Commun 9, 651 (2018).

      (5) Tan, W., Park, J., Venugopal, H. et al. MORC2 is a phosphorylation-dependent DNA compaction machine. Nat Commun 16, 5606 (2025).

      (6) Sánchez-Solana B, Li DQ, Kumar R. Cytosolic functions of MORC2 in lipogenesis and adipogenesis. Biochim Biophys Acta. 1843(2):316-326 (2014).

      (7) Li, C.H., Coffey, E.L., Dall’Agnese, A. et al. MeCP2 links heterochromatin condensates and neurodevelopmental disease. Nature 586, 440–444 (2020).

    1. eLife Assessment

      This study addresses an important question in gustatory neuroscience by developing a machine-learning classifier to identify distinct ingestive orofacial movement subtypes from electromyographic recordings and relating their dynamics to population-level activity in the gustatory cortex. The evidence that transitions in cortical ensemble firing are temporally associated with reorganization of ingestive movement patterns is convincing, though some aspects of the behavioral classification and neural analyses require further validation and clarification. The work provides a technically innovative framework for linking neural state dynamics to the motor expression of taste-guided decisions.

    2. Reviewer #1 (Public review):

      Summary:

      This study investigates how ingestive behaviors are reflected in muscle activity and how these behaviors relate to neural dynamics in the brain. By combining muscle recordings with computational analysis, the authors identify patterns of mouth movements and show that these change over time and align with changes in brain activity. The work suggests that ingestion is not defined by a single action but by coordinated changes across multiple behaviors.

      Strengths:

      (1) Addresses an important and underexplored question about how ingestive behavior is organized.

      (2) Combines behavioral, physiological, and computational approaches creatively.

      (3) Provides a novel framework for quantifying complex ingestive movements.

      (4) Demonstrates a clear temporal relationship between behavior and brain activity.

      Weaknesses:

      (1) Behavioral labels rely on video-based scoring, which may not fully capture subtle or hidden movements.

      (2) The relationship between brain activity and behavior is correlational, but sometimes interpreted more strongly.

      (3) The manuscript could be clearer and more accessible to readers outside the field.

    3. Reviewer #2 (Public review):

      Summary:

      In this study, Baas-Thomas et al. aim to study the neural mechanisms underlying ingestive versus rejection responses to taste stimuli by developing an EMG-based approach to identify ingestion-related orofacial movements. Whereas prior work has focused primarily on detecting rejection-related gapes, the authors introduce a machine-learning classifier that uses waveform features extracted from anterior digastric (AD) EMG signals to detect mouth- and tongue-movement (MTM) events associated with ingestion. Clustering analyses further suggest that ingestive behavior consists of multiple MTM subtypes whose relative frequencies vary across trial time and taste conditions. Finally, simultaneous recordings indicate that shifts in MTM expression follow transitions in gustatory cortex (GC) population dynamics into palatability-related firing states, supporting a role for cortical ensemble activity in coordinating ingestive motor responses.

      Strengths:

      Overall, the scientific question addressed in this study is well motivated. A mechanistic understanding of ingestive decision-making requires a precise characterization of the motor patterns that implement ingestion, and these behaviors have remained insufficiently resolved in prior work. The authors take a reasonable and technically innovative approach by leveraging AD EMG recordings to classify distinct orofacial movement patterns. The extracted waveform features appear effective in separating gapes from ingestion-related mouth-tongue movements, and clustering analyses further suggest the presence of distinguishable MTM subtypes that show meaningful temporal structure and neural correlates. Taken together, the work provides a potentially useful framework for linking gustatory cortical dynamics to the motor expression of taste-guided decisions.

      A particularly valuable aspect of this work is the attempt to move beyond a binary characterization of ingestive behavior and instead identify multiple subtypes of ingestion-related movements. This finer behavioral resolution has the potential to provide a more realistic account of how complex consummatory actions are organized. More broadly, the effort to relate structured behavioral motifs to population-level neural dynamics is conceptually interesting and could prove useful for future studies seeking to connect circuit dynamics with the motor implementation of motivated behaviors.

      Weaknesses:

      (1) I have several concerns regarding the methodological comparisons used to establish the superiority of the proposed XGBoost classifier. In particular, the comparison between the XGBoost classifier and previously used QDA approaches (Figure 3) may not be entirely well-matched. The QDA framework was originally designed primarily to detect gape events and does not explicitly assign labels to MTM movements. As a result, the apparent advantage of XGBoost in identifying MTMs may partly reflect differences in task formulation rather than intrinsic differences in classification performance. From visual inspection, gape detection performance appears broadly comparable across methods.

      A more informative benchmark would involve comparing XGBoost to an extended pipeline in which QDA-based gape detection is combined with a secondary movement-detection stage, distinguishing MTMs from periods of no movement. Such a comparison would better isolate the contribution of classifier architecture per se. Without this control analysis, the strength of the claim that XGBoost provides superior performance for behavioral decoding remains somewhat uncertain.

      (2) The presentation of the neural ensemble analyses is considerably less comprehensive and intuitive than that of the behavioral analyses. The manuscript would benefit from more direct visualization of inferred neural state transitions. For example, plotting predicted neural states in a manner analogous to the behavioral states illustrated in Figure 6B would improve interpretability and help readers understand how neural dynamics relate temporally to behavioral changes.

      In addition, the interpretation that GC ensemble dynamics drive behavioral state transitions may require further clarification. If GC activity plays a causal role in initiating behavioral changes, one might expect a consistent brain-to-behavior lag across changepoints. However, Figure 6 appears to show such lag primarily at the second transition but not at the first. This raises questions about how uniformly the proposed causal interpretation applies across state boundaries, and additional analysis or discussion is needed.
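      One way to make the lag question concrete is to summarize, separately for each transition, the per-trial difference between the behavioral and neural changepoint times. The sketch below is purely illustrative: the function name, the input format (paired per-trial changepoint times in milliseconds), and the choice of summary statistics are assumptions, not the authors' analysis.

```python
from statistics import mean, median

def lag_summary(neural_ms, behavior_ms):
    """Summarize brain-to-behavior lags for one transition.

    neural_ms, behavior_ms : per-trial changepoint times (ms), paired by trial.
    A positive lag means the behavioral change followed the neural change.
    """
    if len(neural_ms) != len(behavior_ms) or not neural_ms:
        raise ValueError("inputs must be non-empty and of equal length")
    lags = [b - n for n, b in zip(neural_ms, behavior_ms)]
    return {
        "mean_lag_ms": mean(lags),
        "median_lag_ms": median(lags),
        # Fraction of trials in which behavior followed the neural change.
        "fraction_behavior_after": sum(lag > 0 for lag in lags) / len(lags),
    }
```

      Running such a summary on the first and second transitions separately would show whether a consistently positive lag is present at both boundaries or only at the second, which is the distinction raised above.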

      (3) The neural ensemble analyses primarily focus on constructing higher-level behavioral state variables rather than directly testing how individual movement subtypes relate to neural activity. The behavioral interpretation of the inferred state structure, therefore, remains somewhat unclear. While this approach is consistent with previous work from the authors and with broader state-transition frameworks of gustatory processing, it is not immediately obvious that this is the most informative level of analysis for the present dataset.

      In particular, it would strengthen the manuscript to examine whether GC neurons or ensembles also encode lower-level motor structure, such as the occurrence of gapes or specific MTM subtypes. Demonstrating selective or mixed encoding across hierarchical levels (movement motifs versus abstract behavioral states) would help clarify the functional interpretation of the reported neural dynamics. At present, the manuscript largely assumes that GC activity reflects higher-order behavioral states without directly testing alternative representational possibilities.

      (4) Because direct behavioral ground truth for intra-oral ingestive movements is difficult to obtain, MTM subtypes are inferred primarily through clustering of EMG waveform features. Although the authors demonstrate statistical separability and cross-session stability of these clusters, it remains unclear whether they correspond to discrete motor programs or instead reflect a structured partitioning of a continuous behavioral space shaped by feature selection and preprocessing choices. Perhaps some additional robustness analyses or convergent validation (e.g., alternative clustering methods, feature perturbation tests, or stronger neural and behavioral dissociations) would help clarify the biological significance of the inferred subtype structure.

    4. Reviewer #3 (Public review):

      Summary:

      This study examines how ingestive-related orofacial movements relate to ensemble dynamics in gustatory cortex (GC) during taste processing. Previous work has shown that GC activity evolves through a sequence of population states following taste delivery, culminating in a transition to palatability-related firing that precedes rejection-related orofacial movements (e.g., gaping). Importantly, perturbing GC activity around the time of this transition alters the timing of gaping, suggesting that these ensemble dynamics play a functional role in linking taste evaluation to behavioral responses. The present study asks whether similar neural dynamics are also associated with ingestive-related orofacial movements that occur during the consumption of palatable stimuli.

      To address this question, the authors develop a machine-learning classifier to identify distinct orofacial movements from anterior digastric EMG recordings. Using a set of labeled EMG waveforms obtained from video-scored trials, a gradient-boosted (XGBoost) classifier is trained to detect gapes, mouth/tongue movements (MTMs), and periods of no movement. Applying this classifier to a larger EMG dataset reveals that ingestive-related MTMs cluster into three distinct movement subtypes whose frequencies change systematically within trials.

      The authors then relate these behavioral dynamics to previously described GC ensemble transitions identified using changepoint modeling. They report that changes in MTM subtype frequencies tend to occur shortly after the transition to palatability-related activity in GC. These results suggest that GC population dynamics are temporally associated not only with rejection-related behaviors but also with ingestive motor patterns that occur as animals prepare to consume palatable tastants.

      Strengths:

      The study introduces an innovative framework for extracting intricate orofacial movement information from EMG recordings. The machine-learning classifier provides a scalable method for identifying specific orofacial movements and performs better than previously published algorithms designed for gape detection. This approach allows the authors to examine movement microstructure at a temporal resolution that cannot be achieved with video scoring in freely moving animals.

      A second strength is the integration of orofacial movement analysis with neural population dynamics. By relating EMG-derived movement subtypes to ensemble state transitions in GC, the study builds on a substantial body of work examining the temporal evolution of taste responses in cortex. The finding that ingestive-related movement dynamics occur shortly after the emergence of palatability-related firing provides an interesting extension of previous observations linking GC state transitions to rejection behavior.

      The manuscript is also commendable in its commitment to data accessibility. By providing clear information about how the datasets can be accessed and making training data for the classifier publicly available, the authors make it possible for other researchers to examine the analytical pipeline and apply similar approaches to their own datasets. This transparency provides a useful framework for extending and building upon the methods presented here.

      Weaknesses:

      Some aspects of the EMG-based movement classification pipeline warrant careful interpretation. The training dataset used for classifier development is relatively small and is derived from a subset of trials in which mouth movements were clearly visible in video recordings. While the classifier performs well on this labeled dataset, it is not entirely clear how representative these labeled examples are of the full range of EMG signals present in the larger dataset.

      The interpretation of the three identified MTM subtypes also remains somewhat tentative. The study convincingly demonstrates that distinct waveform-defined clusters exist in the EMG data, but the functional significance of these clusters as ingestive "behaviors" is less clear. As acknowledged by the authors, the specific roles of these movement patterns in the ingestion process remain speculative.

      Finally, several conclusions in the Discussion rely on relatively strong mechanistic language when describing the relationship between GC dynamics and ingestive behavior. The data clearly demonstrate a temporal association between GC state transitions and changes in the frequencies of the different MTM subtypes. However, the results primarily support the interpretation that similar cortical dynamics are associated with ingestive and rejection-related behaviors rather than definitively establishing that these behaviors "are governed by the same underlying neural mechanisms".

    5. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      (1) Behavioral labels rely on video-based scoring, which may not fully capture subtle or hidden movements.

      This is very true; certainly, this work is only a starting point. But the techniques used for this manuscript, despite starting with video-based scoring, specifically did allow us to differentiate behaviors that were too subtle to recognize in the video. For the revision, we will describe how this work leads to future studies in which we will be able to explore other means of collecting behavioral labels, potentially directly from simultaneous recordings of multiple muscles.

      (2) The relationship between brain activity and behavior is correlational, but sometimes interpreted more strongly.

      We will comb through the manuscript and make edits to be more precise and technically correct in presenting this relationship, and clarify that our suggestion of a causal link is only indirect and related to previous work (Mukherjee et al. 2019).

      (3) The manuscript could be clearer and more accessible to readers outside the field.

      We will edit the manuscript in multiple places to make technical and field-specific aspects more accessible. As part of this, in appreciation of Reviewer 2’s comments, we will take additional care to elaborate on and clarify our need for, and interpretation of, SHAP values and classifier structure.

      Reviewer #2 (Public review):

      (1) I have several concerns regarding the methodological comparisons used to establish the superiority of the proposed XGBoost classifier. In particular, the comparison between the XGBoost classifier and previously used QDA approaches (Figure 3) may not be entirely well-matched. The QDA framework was originally designed primarily to detect gape events and does not explicitly assign labels to MTM movements. As a result, the apparent advantage of XGBoost in identifying MTMs may partly reflect differences in task formulation rather than intrinsic differences in classification performance. From visual inspection, gape detection performance appears broadly comparable across methods.

      A more informative benchmark would involve comparing XGBoost to an extended pipeline in which QDA-based gape detection is combined with a secondary movement-detection stage, distinguishing MTMs from periods of no movement. Such a comparison would better isolate the contribution of classifier architecture per se. Without this control analysis, the strength of the claim that XGBoost provides superior performance for behavioral decoding remains somewhat uncertain.

      The revision will further clarify that, as the reviewer notes, the primary improvement in XGB classification compared to QDA (in multi-class aggregated metrics) comes specifically from its ability to classify MTMs, and that for gapes, both QDA and XGB perform on par. We will be more explicit about the fact that our goal in constructing the classifier is not to compare “classifier architecture”—not to find the very best classifier possible—but rather to take the next step by generating an instance of a classifier that performs demonstrably better on aggregated orofacial movements. We will update the manuscript to be more clear in our claims in this regard, and how the current XGB classifier can, once validated, be bootstrapped by future techniques (possibly using more informative data sources) to more fully characterize orofacial movements.
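
      The point that an aggregated multi-class advantage can come entirely from one class can be illustrated with per-class recall. The following is a minimal sketch, not the authors' analysis: the labels are hypothetical, `per_class_recall` is an illustrative helper, and the "gape-only" predictions simply stand in for a detector that never assigns an MTM label.

```python
from collections import Counter

def per_class_recall(true_labels, predicted_labels):
    """Return recall for each class: correct predictions / true occurrences."""
    totals = Counter(true_labels)
    hits = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {label: hits[label] / totals[label] for label in totals}

# Hypothetical labels for six EMG events (not data from the study).
truth = ["gape", "gape", "MTM", "MTM", "MTM", "none"]
# Stand-in for a detector that was never designed to emit "MTM".
gape_only = ["gape", "gape", "none", "none", "none", "none"]
# Stand-in for a three-class classifier.
three_class = ["gape", "gape", "MTM", "MTM", "none", "none"]

recall_gape_only = per_class_recall(truth, gape_only)
recall_three_class = per_class_recall(truth, three_class)
```

      In this toy case the two sets of predictions have identical gape recall, and the whole difference sits in the MTM class, which is why per-class reporting is more informative here than a single aggregated score.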

      (2) The presentation of the neural ensemble analyses is considerably less comprehensive and intuitive than that of the behavioral analyses. The manuscript would benefit from more direct visualization of inferred neural state transitions. For example, plotting predicted neural states in a manner analogous to the behavioral states illustrated in Figure 6B would improve interpretability and help readers understand how neural dynamics relate temporally to behavioral changes.

      In addition, the interpretation that GC ensemble dynamics drive behavioral state transitions may require further clarification. If GC activity plays a causal role in initiating behavioral changes, one might expect a consistent brain-to-behavior lag across changepoints. However, Figure 6 appears to show such lag primarily at the second transition but not at the first. This raises questions about how uniformly the proposed causal interpretation applies across state boundaries, and additional analysis or discussion is needed.

      We are happy to update the figures (likely by adding another panel to Figure 6) to clearly show inference of neural state transitions, in a manner similar to how we have shown behavioral state transitions in Fig. 6B. In addition, we will do a more comprehensive job of describing and referencing earlier work in which we have unpacked these analyses in greater detail—work that makes it clear why we would predict a lag-relationship for one set of change points and not the other.
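
      The lag question raised by the reviewer can be phrased as a per-trial difference between behavioral and neural changepoint times, summarized separately for each transition. The sketch below assumes paired changepoint times per trial; the numbers and the helper `transition_lags` are hypothetical and are not taken from the manuscript.

```python
from statistics import median

def transition_lags(neural_times_ms, behavior_times_ms):
    """Per-trial lag (behavior minus neural); positive means behavior follows."""
    return [b - n for n, b in zip(neural_times_ms, behavior_times_ms)]

# Hypothetical changepoint times in ms after taste delivery, one value per trial.
first_neural, first_behavior = [250, 300, 280], [240, 310, 280]
second_neural, second_behavior = [900, 1000, 950], [1050, 1120, 1100]

first_lag = median(transition_lags(first_neural, first_behavior))
second_lag = median(transition_lags(second_neural, second_behavior))
```

      With these made-up values the first transition shows no systematic lag and the second shows behavior trailing the neural transition, which is the qualitative pattern the reviewer describes for Figure 6.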

      (3) The neural ensemble analyses primarily focus on constructing higher-level behavioral state variables rather than directly testing how individual movement subtypes relate to neural activity. The behavioral interpretation of the inferred state structure, therefore, remains somewhat unclear. While this approach is consistent with previous work from the authors and with broader state-transition frameworks of gustatory processing, it is not immediately obvious that this is the most informative level of analysis for the present dataset.

      In particular, it would strengthen the manuscript to examine whether GC neurons or ensembles also encode lower-level motor structure, such as the occurrence of gapes or specific MTM subtypes. Demonstrating selective or mixed encoding across hierarchical levels (movement motifs versus abstract behavioral states) would help clarify the functional interpretation of the reported neural dynamics. At present, the manuscript largely assumes that GC activity reflects higher-order behavioral states without directly testing alternative representational possibilities.

      The reviewer makes a good point. While previous work from the lab (Li et al. 2016) has assessed the relationship of GC activity with both the onset of gaping (i.e., the behavioral state transition) and individual gapes and found only a relationship with onset of gaping (findings that we now explicitly describe in the revision), we have not performed a similar analysis for MTMs. We will do so and add it to the paper.

      (4) Because direct behavioral ground truth for intra-oral ingestive movements is difficult to obtain, MTM subtypes are inferred primarily through clustering of EMG waveform features. Although the authors demonstrate statistical separability and cross-session stability of these clusters, it remains unclear whether they correspond to discrete motor programs or instead reflect a structured partitioning of a continuous behavioral space shaped by feature selection and preprocessing choices. Perhaps some additional robustness analyses or convergent validation (e.g., alternative clustering methods, feature perturbation tests, or stronger neural and behavioral dissociations) would help clarify the biological significance of the inferred subtype structure.

      We admit (in fact, we have done so in the text) that we are not yet at the point of being able to “split hairs” to this degree (although we, like R2, see that as a goal). In the meantime, we will expand the section of Results text in which we describe the fact that the clustering of behaviors is observed both in “waveform space” (Fig. 4E was generated using standardized waveforms) and “feature space” (Fig. 4B, C, and F), and that as such the clusters are NOT simply a partitioning of continuous, unimodal behavioral space. We will report convergent results from alternative (k-means) clustering methods to further support that conclusion. Finally, we will describe (in the Discussion section) ways to more rigorously test and extend this claim in future work.

      Reviewer #3 (Public review):

      Some aspects of the EMG-based movement classification pipeline warrant careful interpretation. The training dataset used for classifier development is relatively small and is derived from a subset of trials in which mouth movements were clearly visible in video recordings. While the classifier performs well on this labeled dataset, it is not entirely clear how representative these labeled examples are of the full range of EMG signals present in the larger dataset.

      Very good point. We will update the text to note this qualification to the reader. We will also, however, highlight the fact that our focus on a highly reliable and representative (i.e., agreed upon by 2 independent, blind scorers) subset of labels allows us to perform more targeted analyses and make more targeted interpretations of our results. And we will also be more pointed in the revision, as we have noted above, about the fact that this work is only scratching the surface of what can be accomplished in this domain, and that future work will involve STARTING with the waveforms that aren't accounted for in terms of gapes and MTMs.

      The interpretation of the three identified MTM subtypes also remains somewhat tentative. The study convincingly demonstrates that distinct waveform-defined clusters exist in the EMG data, but the functional significance of these clusters as ingestive "behaviors" is less clear. As acknowledged by the authors, the specific roles of these movement patterns in the ingestion process remain speculative.

      We share R3’s desire for clarity on this point—we do not wish to imply that we understand more than we understand—and will be sure to fine-tune our language to make clearer and more explicit the fact that the distinction in the roles of the MTM subtypes in ingestion at this point remains speculative.

      Finally, several conclusions in the Discussion rely on relatively strong mechanistic language when describing the relationship between GC dynamics and ingestive behavior. The data clearly demonstrate a temporal association between GC state transitions and changes in the frequencies of the different MTM subtypes. However, the results primarily support the interpretation that similar cortical dynamics are associated with ingestive and rejection-related behaviors rather than definitively establishing that these behaviors "are governed by the same underlying neural mechanisms".

      We will soften our language to clarify which of our Discussion suggestions are speculation, highlighting for the reader the fact that our data, while consistent with evidence suggesting a causal link between the GC transition and gaping (Li et al., 2016; Mukherjee et al., 2019), do not prove a causal neural-behavioral link for MTMs.

      References:

      Li, Jennifer X., et al. “Sensory Cortical Activity Is Related to the Selection of a Rhythmic Motor Action Pattern.” The Journal of Neuroscience, vol. 36, no. 20, May 2016, pp. 5596–607. DOI.org (Crossref), https://doi.org/10.1523/JNEUROSCI.3949-15.2016.

      Mukherjee, Narendra, et al. “Impact of Precisely-Timed Inhibition of Gustatory Cortex on Taste Behavior Depends on Single-Trial Ensemble Dynamics.” eLife, edited by Laura L. Colgin et al., vol. 8, June 2019, p. e45968. eLife, https://doi.org/10.7554/eLife.45968.

    1. eLife Assessment

      This fundamental study provides a major contribution to our understanding of Amyotrophic Lateral Sclerosis (ALS) pathogenesis by utilizing a primate model that overcomes the historical limitations of rodent paradigms. By demonstrating the retrograde and trans-synaptic spread of pathological TDP-43 from the periphery to the spinal cord and motor cortex, the authors propose a new model for the disease spreading. The evidence supporting these findings is compelling, characterized by rigorous post-mortem histological observations. This work will be of profound interest to neuroscientists and translational researchers seeking to decode the mechanisms of systemic disease progression in ALS.

    2. Reviewer #1 (Public review):

      Summary:

      The authors have used a macaque (two animals only) to follow the migration of 'seeded' TDP-43 protein in neuronal pathways - thus mimicking the spread of ALS in the human CNS. Previous experiments in rodents failed to demonstrate this, posing interesting and important biological differences, possibly related to the UMN-LMN system in higher order apes and humans.

      Strengths:

      An important step forward.

      Weaknesses:

      No weaknesses were identified by this reviewer. Only 2 animals were used, but that is appropriate given the sensate status of the macaque. In the opinion of this reviewer, the results are entirely convincing.

    3. Reviewer #2 (Public review):

      Summary:

      There are astonishingly few papers trying to reproduce the process of initiation and spreading that Braak's studies have suggested and postulated. The authors should be applauded for pioneering such a difficult experiment. They overexpressed the TDP-43 protein in the motor neuron pool of the brachioradialis muscle and showed that by this technique, motor neurons in this pool died, and the muscle got denervated. They had evidence of a spreading process from the spinal cord to the cortex, demonstrated by showing widespread deposits of phosphorylated TDP-43 bilaterally in the cervical cord and the motor cortex. By their experiment, they created a dying-backwards model, not a model of corticofugal spread, like that shown by Braak. No muscle weakness was observed, not even in the brachioradialis.

      Strengths:

      The strength of this innovative study is the fact that this spreading experiment uses the phylogenetically young connectome of primates (macaques). They also made the thought-provoking observation of spreading from the cord to the motor cortex, not the corticofugal spread model observed by Heiko Braak. This is thought-provoking because this enables the observer to compare their model with the findings in humans.

      Weaknesses:

      The following aspects are not a weakness but need to be better explained for the interested reader - and potentially improved in future studies for which the authors laid the foundation:

      (1) Why do the authors use the brachioradialis motor neuron pool to overexpress TDP-43? More is known about other muscles and how they are embedded in the motor connectome of primates. Why not the biceps brachii or the hand extensors or - even better - the small muscles of the hand? These are known to be strongly monosynaptically connected with the motor cortex. The authors should explain this. I am unclear if there was a specific reason which I did not see or understand. In my view, the brachioradialis is not the best representative of the primate connectome, for example, to examine this model and compare it with the corticofugal spread.

      (2) In Braak's experiment, only (seemingly soluble) non-phosphorylated TDP-43 "crossed" synapses. Phosphorylated TDP-43 did not do this. The authors of this study saw phosphorylated TDP-43 in motor neurons and the cortex. Is there any potential explanation for how it crosses synapses? If it really does, there is an obvious difference to the human situation which needs to be emphasized and explained (in the future).

      (3) There were significant deposits of phosphorylated TDP-43 in oligodendrocytes in humans. Whilst I understand that one experiment cannot solve every question - I am curious about whether the authors saw anything in oligodendrocytes?

      (4) What was the pattern of damage? Of course, this pattern is not likely to have a monosynaptic pattern - like in humans - but was there a pattern? Did it have a physiologically meaningful basis? Was there any relation to the corticofugal monosynaptic pattern? What are the differences? The authors speak of "multiple waves". Does this mean that if this were a corticofugal model, for example, oculomotor neurons would also degenerate?

    4. Reviewer #3 (Public review):

      Summary:

      In this paper by Jones and colleagues, a non-human primate model is described in which wild-type TDP-43 is expressed in the cervical spinal cord. This gave rise to loss of motor neurons in the ventral horn at that level in the cervical spinal cord. MRI of the muscles showed increased intensity in the most affected muscle, the brachioradialis, suggesting that this muscle becomes denervated. At the neuropathological level, TDP-43 and pTDP-43 staining in the cytoplasm is increased, not only at the specific level of the cervical spinal cord, but also at a distance.

      Strengths:

      A clear strength is the state-of-the-art focal expression of the TDP-43 transgene in the cervical spinal cord. This is achieved by combining general expression of a flipped, loxP-flanked TDP-43 vector, delivered by intrathecal AAV9 administration, with intramuscular injection of an AAV2 hSyn CRE-TdTomato vector into the brachioradialis muscle, in order to induce focal recombination and expression of TDP-43 in motor neurons innervating this muscle on one side.

      Another strength is the non-human primate background, which is much closer to the human situation.

      Weaknesses:

      Given the complexity and cost of the model, the n is very low.

      The design of the experiments and the results shown about the toxicity induced by this focal TDP-43 expression do not allow us to conclude that it is a good model for ALS for several reasons. It is not clear that the TDP-43 overexpression results in spreading weakness or in spreading motor neuron loss. The neuropathological changes described suggest that there is a kind of stress response, which extends to regions away from the site of primary damage, but more is needed to provide convincing evidence that there is spreading of disease pathology reminiscent of human ALS.

    5. Reviewer #4 (Public review):

      Summary:

      In this manuscript, the authors present data describing the development of a model of ALS in rhesus macaques. They use a viral intersectional model to overexpress TDP-43 in a population of motor neurons and then study the spread of the pathology about 7 months later. They demonstrate that both the cervical spinal cord and motor cortex (new and old M1) are full of TDP-43, suggesting that the pathology spreads from the single motor pool to presumably related neurons.

      Strengths:

      This is a super-important study in two main ways:

      (1) This could be the birth of a really important model, one that is really needed for making progress in understanding ALS and the development of therapeutics. There are shortfalls with all the rodent models. Models dependent on cell cultures are superb for understanding cell-autonomous processes, but miss out on connectivity, particularly the long-range connectivity. Organoids may ultimately prove to be beneficial, but they would need cortex, spinal cord, and muscle, and translatability from them is not assured. So a NHP model is needed, and this may be it. Furthermore, the Methods are meticulously described and will undoubtedly facilitate reproducibility.

      (2) The concept of the spread of pathology has been proposed for some time, I think, based initially on the detailed clinical observations of Ravits and colleagues. The authors have looked at this directly and provide supporting evidence for this interesting hypothesis. They show spread locally and contralaterally in the spinal cord (although a figure would be nice) and to the motor cortex.

      Taking only these 2 points into account is more than sufficient for me to be enthusiastic about this work.

      Weaknesses:

      I'd like to make a couple of points that, if addressed, could, in my view, help the authors strengthen this work.

      (1) We don't know how many MNs were transduced by the rAAV. There was no tdTom expression, for whatever reason. The authors show an image of a control experiment with a single MN transduced, but there should be a red motor pool, at least in the control experiments. The impression that I get is that very few were transduced, and, in my mind, this makes the findings even more interesting - maybe you don't need many "starter" MNs.

      (2) Continuing on this point, this leads the authors to conclude that all BR MNs have died. They support this by the reduced MN count (see point 3). Firstly, do we know how many BR MNs there are in the rhesus macaque, and does the reduction seen correspond to this number? Secondly, and more importantly, the muscle looks normal on MRI at 28 weeks - it does not look like a denervated muscle. The authors state that it has maybe been reinnervated, but by what, if all the BR MNs are dead? This does not seem like a plausible explanation to me. Muscle histology, NMJs, and fibre typing would have been useful to understand what's going on with the MNs. (And electrophysiology would have been wonderful, but beyond the scope of this study.)

      (3) Some MN biologists, like me, fuss a lot about how to count MNs, which is almost as difficult as counting the number of angels on the head of a pin. Every method has its problems. Focusing on the two methods here: (a) ChAT immunohistochemistry is pretty good in healthy states, but we don't know what happens to ChAT expression in different diseases, particularly when you have a new model. If its expression is decreased, then it is not a good marker for MNs; (b) Identifying MNs based on the size and morphology of neurons in the ventral horn is also insufficient. For example, ~30% of neurons in a typical pool are small gamma MNs, and a significant proportion (depending on the muscle) of the remainder will be small alpha MNs. So what one is counting is, at best, the large alpha MNs, not all the MNs in a pool. And in ALS, it's these largest MNs that are affected at the earliest stages. The small ones might be fine. So results will be skewed. (Hence, it would be interesting to see if the muscle had a higher proportion of Type I fibres after being reinnervated by S-type MNs.)

      (4) Statistics. These are complex experiments looking at the spread of a disease. The experimental unit is therefore the monkey, n=2. In each monkey, multiple sections are analysed, which are key technical replicates and often summative. For example, do we care about the average cell number in Figures 4D, E, 5 I, J or 6G, H, or rather the total cell number? Do the error bars mean anything? To be clear, I am by no means minimising the importance of the overall convincing findings. But I do not think this statistical analysis is particularly meaningful.
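
      The distinction between technical replicates (sections) and the experimental unit (the animal) can be made concrete by collapsing section-level counts to one summary per animal before any comparison is drawn. This is a minimal sketch under stated assumptions: the counts, animal labels, and the helper `per_animal_summary` are hypothetical and do not come from the manuscript.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-section motor neuron counts: (animal, side, count).
sections = [
    ("M1", "ipsi", 12), ("M1", "ipsi", 10),
    ("M1", "contra", 18), ("M1", "contra", 20),
    ("M2", "ipsi", 9), ("M2", "ipsi", 11),
    ("M2", "contra", 19), ("M2", "contra", 17),
]

def per_animal_summary(rows):
    """Collapse sections (technical replicates) to one total and mean per animal and side."""
    grouped = defaultdict(list)
    for animal, side, count in rows:
        grouped[(animal, side)].append(count)
    return {
        key: {"total": sum(counts), "mean": mean(counts), "n_sections": len(counts)}
        for key, counts in grouped.items()
    }

summary = per_animal_summary(sections)
```

      After this step there are only two values per side (one per animal), which makes explicit that error bars computed across sections describe sampling within an animal rather than variability between animals.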

    6. Author response:

      Public Reviews:

      Reviewer #1 (Public review): 

      Summary: 

      The authors have used a macaque (two animals only) to follow the migration of 'seeded' TDP-43 protein in neuronal pathways - thus mimicking the spread of ALS in the human CNS. Previous experiments in rodents failed to demonstrate this, posing interesting and important biological differences, possibly related to the UMN-LMN system in higher order apes and humans.

      Strengths: 

      An important step forward. 

      Weaknesses: 

      No weaknesses were identified by this reviewer. Only 2 animals were used, but that is appropriate given the sensate status of the macaque. In the opinion of this reviewer, the results are entirely convincing. 

      Reviewer #2 (Public review): 

      Summary: 

      There are astonishingly few papers trying to reproduce the process of initiation and spreading that Braak's studies have suggested and postulated. The authors should be applauded for pioneering such a difficult experiment. They overexpressed the TDP-43 protein in the motor neuron pool of the brachioradialis muscle and showed that by this technique, motor neurons in this pool died, and the muscle got denervated. They had evidence of a spreading process from the spinal cord to the cortex, demonstrated by showing widespread deposits of phosphorylated TDP-43 bilaterally in the cervical cord and the motor cortex. By their experiment, they created a dying-backwards model, not a model of corticofugal spread, like that shown by Braak. No muscle weakness was observed, not even in the brachioradialis.

      Strengths: 

      The strength of this innovative study is the fact that this spreading experiment uses the phylogenetically young connectome of primates (macaques). They also made the thought-provoking observation of spreading from the cord to the motor cortex, not the corticofugal spread model observed by Heiko Braak. This is thought-provoking because this enables the observer to compare their model with the findings in humans. 

      Weaknesses: 

      The following aspects are not a weakness but need to be better explained for the interested reader - and potentially improved in future studies for which the authors laid the foundation: 

      (1) Why do the authors use the brachioradialis motor neuron pool to overexpress TDP-43? More is known about other muscles and how they are embedded in the motor connectome of primates. Why not the biceps brachii or the hand extensors or - even better - the small muscles of the hand? These are known to be strongly monosynaptically connected with the motor cortex. The authors should explain this. I am unclear if there was a specific reason which I did not see or understand. In my view, the brachioradialis is not the best representative of the primate connectome, for example, to examine this model and compare it with the corticofugal spread. 

      The brachioradialis muscle was chosen primarily for reasons of animal welfare; our concern when designing the experiments was that the muscle we chose for injection might become very wasted and weak before the experiment had been completed. If we had injected a hand muscle, this would have affected manipulation, feeding and grooming behaviours, whereas had we injected biceps brachii or forearm extensors, this would have affected more important behaviours requiring strength for body support in the home cage (e.g. climbing, swinging, etc.). The advantage of choosing brachioradialis is that there is some functional redundancy; in macaques, compared to biceps brachii, brachioradialis has a relatively minor role in elbow flexion and supination of the forearm. We therefore reasoned that there should be physiological compensation for any weakness in brachioradialis, and thus minimal effects on normal behaviour.

      A secondary practical consideration was the importance of good quality MR imaging of the injected muscle and the positioning of the focussing coil; because of the physical constraints related to the monkey sitting in our narrow-bore scanner, the forearm muscles were the optimal choice. 

      With reference to the ‘primate connectome’, whilst hand muscles are known to have strong corticomotoneuronal connections, we have shown previously that monosynaptic corticomotoneuronal connections are as strong in muscles innervated by the deep radial nerve (like brachioradialis) as in intrinsic hand muscles (Witham et al., 2016).

      Finally, for the purposes of these experiments, all we required was a method for inoculating TDP-43 into a motor neuron pool within the spinal cord, without direct surgical trauma to the spinal cord. Our aim was to test the hypothesis that extracellular TDP-43 is sufficient to cause spreading neuronal changes in macaque, similar to those observed in human ALS/MND; our aim was not to replicate the actual pattern of human MND observed clinically.

      These points will be addressed in a revised version of the manuscript. 

      (2) In the Braaks experiment, only (seemingly soluble) non-phosphorylated TDP-43 "crossed" synapses. Phosphorylated TDP-43 did not do this. The authors of this study saw phosphorylated TDP-43 in motor neurons and the cortex. Is there any potential explanation for how it crosses synapses? If it really does, there is an obvious difference to the human situation which needs to be emphasized and explained (in the future). 

      To clarify, there was no evidence of phosphorylated TDP-43 crossing synapses. It is more likely that excess non-phosphorylated TDP-43 crossed synapses, and that this then subsequently led to TDP-43 phosphorylation.  

      (3) There were significant deposits of phosphorylated TDP-43 in oligodendrocytes in humans. Whilst I understand that one experiment cannot solve every question - I am curious about whether the authors saw anything in oligodendrocytes? 

      We have not looked at this.

      (4) What was the pattern of damage? Of course, this pattern is not likely to have a monosynaptic pattern - like in humans... but was there a pattern? Did it have a physiologically meaningful basis? Was there any relation to the corticofugal monosynaptic pattern? What are the differences? The authors speak of "multiple waves". Does this mean that if this were a corticofugal model, for example, oculomotor neurons would also degenerate? 

      The description of ‘multiple waves’ in paragraph 2 of the discussion section is entirely hypothetical, based on the assumption that there are different mechanisms by which TDP-43 spreads through the nervous system, from slow local spread by diffusion to more rapid long-range axonal spread to widely separated regions. For the neuropathological staging analysis, we therefore looked at different brain regions (hypoglossal nuclei, reticular formation, inferior olives, frontal cortex, temporal cortex and hippocampal formation). This analysis only showed loss of motor neurons in the spinal cord ipsilateral to the side of the muscle injections, in segments consistent with the location of brachioradialis motoneurons. We did not demonstrate a ‘pattern of damage’ as described in humans in our experiments because this is a pre-symptomatic pre-clinical model, with no established ‘damage’ from each wave. We speculate that this is because animals were terminated too early in the disease process.

      However, whilst there was no established neuronal degeneration outside the cervical spinal cord, the observation that there were more pTDP-43 positive Betz cells in left (contralateral to the brachioradialis injection) New M1 than Old M1 (see Figure 6I and J) would support spread via monosynaptic connections to motoneurons; New M1 is where most monosynaptic cortico-motoneuronal connections originate.

      Reviewer #3 (Public review): 

      Summary: 

      In this paper by Jones and colleagues, a non-human primate model is described in which wild-type TDP-43 is expressed in the cervical spinal cord. This gave rise to loss of motor neurons in the ventral horn at that level in the cervical spinal cord. MRI of the muscles revealed increased intensity in the most affected brachioradialis muscle, suggesting that this muscle becomes denervated. At the neuropathological level, TDP-43 and pTDP-43 staining in the cytoplasm is increased, not only at the specific level of the cervical spinal cord, but also at a distance. 

      Strengths: 

      A clear strength is the state-of-the-art focal expression of the TDP-43 transgene at a single site in the cervical spinal cord. This is achieved by combining general expression of a flipped, loxP-flanked TDP-43 vector, delivered by intrathecal AAV9 administration, with a subsequent intramuscular injection of an AAV2 hSyn CRE-TdTomato vector into the brachioradialis muscle, in order to induce focal recombination and expression of TDP-43 in motor neurons innervating this muscle on one side. 

      Another strength is the non-human primate background, which is much closer to the human situation. 

      Weaknesses: 

      Given the complexity and cost of the model, the n is very low. 

      As is common in most studies in non-human primates, we have carried out all statistical analysis within one animal (e.g. the comparison of motoneuron numbers between left and right cord). We then show that results are reproducible in two animals. Although the number of animals is lower than in a typical rodent study, we see this as an advantage of the model, adhering to the 3Rs principle of ‘reduction’.

      The design of the experiments and the results shown about the toxicity induced by this focal TDP-43 expression do not allow us to conclude that it is a good model for ALS for several reasons. It is not clear that the TDP-43 overexpression results in spreading weakness or in spreading motor neuron loss. The neuropathological changes described suggest that there is a kind of stress response, which extends to regions away from the site of primary damage, but more is needed to provide convincing evidence that there is spreading of disease pathology reminiscent of human ALS. 

      As already noted in our response to Reviewer 2 (point 1), animal welfare is an important consideration when designing these complex experiments in primates. We could not therefore justify allowing the animals to survive until extensive wasting and weakness were evident, recapitulating the human disease. 

      The model developed in these experiments is therefore a pre-symptomatic pre-clinical model, in which animals are terminated before pathology leading to widespread motor neuron loss is evident. At post mortem we do have evidence of motor neuron loss in the segments supplying brachioradialis (C4-C8).

      Stress of various forms, including blunt trauma (e.g. Anderson et al, 2021), stab/electrode insertion injury (e.g. Zambusi et al, 2022), chemical (e.g. arsenite) exposure (e.g. Huang et al, 2024), or hypoxia (Marcus et al, 2021) can result in pathological nucleocytoplasmic translocation of TDP-43. In our model, there was no direct trauma to the brain or spinal cord ante mortem, excluding one major cause of tissue stress. Hypoxia during the process of euthanasia is possible, but we would expect there would not be enough time before death for this to manifest as TDP-43 translocation. In the literature TDP-43 translocation due to stress is diffuse; we have demonstrated that in our model the TDP-43 pathology is not diffuse but selective. For example, there was no evidence of disease in the oculomotor nuclei; in the primary motor cortex (M1) there are significantly more pathological changes in the evolutionarily younger ‘New M1’ compared to the neighbouring ‘Old M1’.

      It is therefore improbable that our findings could be explained by ‘a kind of stress response’. Our findings are better explained by spread of the TDP-43 protein.

      Reviewer #4 (Public review): 

      Summary: 

      In this manuscript, the authors present data describing the development of a model of ALS in rhesus macaques. They use a viral intersectional model to overexpress TDP-43 in a population of motor neurons and then study the spread of the pathology about 7 months later. They demonstrate that both the cervical spinal cord and motor cortex (new and old M1) are full of TDP-43, suggesting that the pathology spreads from the single motor pool to presumably related neurons. 

      Strengths: 

      This is a super-important study in two main ways: 

      (1) This could be the birth of a really important model, one that is really needed for making progress in understanding ALS and the development of therapeutics. There are shortfalls with all the rodent models. Models dependent on cell cultures are superb for understanding cell-autonomous processes, but miss out on connectivity, particularly the long-range connectivity. Organoids may ultimately prove to be beneficial, but they would need cortex, spinal cord, and muscle, and translatability from them is not assured. So a NHP model is needed, and this may be it.

      Furthermore, the Methods are meticulously described and will undoubtedly facilitate reproducibility. 

      (2) The concept of the spread of pathology has been proposed for some time, I think, based initially on the detailed clinical observations of Ravits and colleagues. The authors have looked at this directly and provide supporting evidence for this interesting hypothesis. They show spread locally and contralaterally in the spinal cord (although a figure would be nice) and to the motor cortex. 

      Taking only these 2 points into account is more than sufficient for me to be enthusiastic about this work. 

      Weaknesses: 

      I'd like to make a couple of points that if addressed, could, in my view, help the authors strengthen this work. 

      (1) We don't know how many MNs were transduced by the rAAV. There was no tdTom expression, for whatever reason. The authors show an image of a control experiment with a single MN transduced, but there should be a red motor pool, at least in the control experiments. The impression that I get is that very few were transduced, and, in my mind, this makes the findings even more interesting - maybe you don't need many "starter" MNs. 

      Unfortunately, we cannot know how many motoneurons were transduced.

      However, the reviewer may be correct, that it is actually only a small fraction of the brachioradialis pool. This is supported by the evidence for rather focal denervation seen on MRI.

      (2) Continuing on this point, this leads the authors to conclude that all BR MNs have died. They support this by the reduced MN count (see point 3). Firstly, do we know how many BR MNs there are in the rhesus macaque, and does the reduction seen correspond to this number? Secondly, and more importantly, the muscle looks normal on MRI at 28 weeks - it does not look like a denervated muscle. The authors state that it has maybe been reinnervated, but by what, if all the BR MNs are dead? This does not seem like a plausible explanation to me. Muscle histology, NMJs, and fibre typing would have been useful to understand what's going on with the MNs. (And electrophysiology would have been wonderful, but beyond the scope of this study.) 

      To clarify, we did not conclude that all brachioradialis motor neurons had died, but rather that all transfected brachioradialis motor neurons had died. As noted above, when these cells die and the muscle is denervated, the MRI signal changes occupy only a small volume of the muscle and are transient. We would not expect to see long-term MRI changes in muscle anatomy after this limited denervation-reinnervation event. 

      Analysis of muscle histology, including fibre typing, is outwith the scope of this initial paper reporting the model; we hope that this will form the basis of a future publication.

      (3) Some MN biologists, like me, fuss a lot about how to count MNs, which is almost as difficult as counting the number of angels on the head of a pin. Every method has its problems. Focusing on the two methods here: (a) ChAT immunohistochemistry is pretty good in healthy states, but we don't know what happens to ChAT expression in different diseases, particularly when you have a new model. If its expression is decreased, then it is not a good marker for MNs; (b) Identifying MNs based on the size and morphology of neurons in the ventral horn is also insufficient. For example, ~30% of neurons in a typical pool are small gamma MNs, and a significant proportion (depending on the muscle) of the remainder will be small alpha MNs. So what one is counting is, at best, the large alpha MNs, not all the MNs in a pool. And in ALS, it's these largest MNs that are affected at the earliest stages. The small ones might be fine. So results will be skewed. (Hence, it would be interesting to see if the muscle had a higher proportion of Type I fibres after being reinnervated by S-type MNs.) 

      This is an interesting point, and we agree that each method used to quantify MN number carries its own limitations. The problem of MN identification is heightened in an MND-like pathological state, especially when considering evidence of reduced ChAT activity in spinal motoneurons in end-stage disease in post mortem human samples (Oda et al, 1995), and more recent evidence from Casas et al. (2013), who demonstrated early presymptomatic reduction in ChAT expression in SOD1G93A mice. It is important to note that this was a modest reduction, not complete abolition of signal (76% of control levels). ChAT immunoreactivity was still present and motor neurons were still identifiable as ChAT-positive at this pre-clinical stage of disease. As counts in our study were performed based on detecting ChAT in cells, it seems unlikely that we would miss cells. However, we cannot rule this out. If indeed this did occur, it would mean that the reduced motoneuron counts which we observed reflect not only cell death, but also profound motoneuron dysfunction which is presumably the proximal precursor to cell death.

      We acknowledge that size-based criteria applied to ChAT-positive neurons will preferentially capture large alpha motor neurons, and that gamma motor neurons and small alpha motor neurons are likely underrepresented in our counts. Our counts therefore reflect the large alpha motor neuron population rather than the total motor neuron pool. We believe that this is not a critical limitation in the context of the present study. Large alpha motor neurons are the population of primary pathological interest in ALS and related MND, being the earliest and most severely affected subtype. The selective vulnerability of fast-fatigable large alpha motor neurons in ALS is well established, and their preferential loss is the defining feature of disease progression in both human post mortem tissue and rodent models (Lalancette-Hébert et al., 2016). In this respect, our size threshold selects for precisely the population whose degeneration is most relevant to the disease phenotype we are modelling. 

      We intend to include comments on these important points in the revised version of the manuscript.

      In response to the final point regarding muscle histology and proportions of Type I fibres, as stated above, reporting of muscle histology, including fibre typing, is planned for a separate publication.

      (4) Statistics. These are complex experiments looking at the spread of a disease. The experimental unit is therefore the monkey, n=2. In each monkey, multiple sections are analysed, which are key technical replicates and often summative. For example, do we care about the average cell number in Figures 4D, E, 5 I, J or 6G, H, or rather the total cell number? Do the error bars mean anything? To be clear, I am by no means minimising the importance of the overall convincing findings. But I do not think this statistical analysis is particularly meaningful. 

      Here, the experimental unit is the tissue slice, mounted on a slide for histological analysis, and not the monkey. All statistical comparisons are made within a single animal. We then show that the findings can be replicated in two animals, both of which show significant results. This is the standard approach taken in primate neuroscience, given the need to reduce animal numbers to the minimum consistent with producing convincing results.
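
      The within-animal comparison described in this response can be illustrated with a minimal sketch. It is not the authors' analysis: the paired sign-flip permutation test, the function name `paired_signflip_test`, and the argument names are all assumptions chosen for illustration, and any counts passed to it here would be hypothetical. The sketch treats each tissue section as one paired observation (injected side vs. control side) from a single animal, so its result speaks only to the side-to-side difference within that animal. It also returns both the mean and the total difference; for a fixed number of sections these differ only by a constant factor, so testing either gives the same permutation p-value.

```python
import numpy as np


def paired_signflip_test(injected_side, control_side, n_perm=10000, seed=0):
    """Paired sign-flip permutation test on per-section cell counts.

    Hypothetical helper: each element is the count from one tissue section
    of a single animal, with the two sides of the cord paired by section.
    """
    injected = np.asarray(injected_side, dtype=float)
    control = np.asarray(control_side, dtype=float)
    if injected.shape != control.shape:
        raise ValueError("Both sides need one count per section")

    diff = control - injected          # positive if the injected side has fewer cells
    observed = diff.mean()

    # Under the null, the sign of each per-section difference is arbitrary.
    rng = np.random.default_rng(seed)
    signs = rng.choice([-1.0, 1.0], size=(n_perm, diff.size))
    null_means = (signs * diff).mean(axis=1)

    # Two-sided p-value with the usual +1 correction.
    p_value = (np.sum(np.abs(null_means) >= abs(observed)) + 1) / (n_perm + 1)
    return {
        "mean_diff": float(observed),
        "total_diff": float(diff.sum()),
        "p_value": float(p_value),
    }
```

      Running the function once per animal and reporting the per-animal results side by side would mirror the "replicated in two animals" logic described above, without pooling sections across animals.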

    1. eLife Assessment

      This study presents a useful array of analyses of the effects of training and/or instruction to use the method of loci during episodic encoding and retrieval. A major strength of the experiment is the impressive recruitment of memory athletes and the training of novice athletes to use the method of loci, long known to improve the precision of memory recall. That said, the sheer number of results and their organization should be addressed by streamlining the results and placing them, wherever possible, in a theoretical framework. As it stands, the presented work is incomplete with respect to the major conclusions that training itself leads to neural differentiation of prefrontal cortical neural patterns, and the authors need to temper these claims.

    2. Reviewer #1 (Public review):

      Summary:

      The question of how or whether "extensive memory training affects neocortical memory engrams" (to use the words of the authors) is an interesting question and an area where I think there is room for advancing current knowledge. That said, I do not think the current paper succeeds in meaningfully addressing this question. At a conceptual level, I really struggled with the predictions and interpretations of the findings. There are also several elements of the experimental paradigm and analysis decisions that feel incompatible with the claims that are made. While the manuscript does demonstrate that several measures of neural pattern similarity differ between the various groups of individuals, the issue is that it is difficult to draw clear conclusions from these findings.

      Strengths:

      (1) This is a very unique dataset. Being able to recruit and enroll high-level memory athletes is impressive.

      (2) In principle, comparing memory athletes to control subjects, active control subjects (who received working memory training), and trained subjects (who received method of loci training) is very appealing.

      (3) In several ways, the authors were rigorous in their analyses.

      (4) In principle, the question of how memory training influences neural similarity vs. dissimilarity is of potential interest.

      Weaknesses:

      (1) As far as I can tell, the training manipulation is fully confounded with instructions. That is, subjects were only instructed to use the method of loci if they had completed method of loci training (or if they were the memory athletes). For the training group, in the pre-training session, there was no strategy instruction (subjects could do whatever they wanted), but post-training, they were told to use the method of loci. I understand the argument, of course, that naïve subjects might not be very good at using the method of loci if they had no experience with it. But, it does seem entirely possible that some (or even many) of the observed fMRI results that are attributed to "extensive training" are better explained by strategy use. That is, maybe the effects can be explained by TRYING to use the method of loci as opposed to actual proficiency with the method of loci. It seems impossible to address this, given the design of the experiments. As such, any claims about the effects of memory training, per se, feel inappropriate. It feels equally plausible that the effects are due to the strategy instruction. If the same results could be obtained through a simple strategy manipulation without ANY training at all, that would radically alter the interpretation of the effects. I think the strategy use account is, in fact, quite viable because it is very easy to improve subjects' memories with a method of loci instruction (relative to no strategy instruction) without ANY practice at all. Obviously, practice does improve memory performance with the method of loci, but my point is that even without any meaningful practice, there is likely to be SOME immediate benefit to adopting the method of loci as a strategy. There is also the question of why the effects for the memory athletes weren't obviously stronger than for the trained group, given that the memory athletes have much more experience with the method of loci. 
      Ultimately, the problem with the current design is that I don't see how one can tease apart the role of training, per se, vs. strategy use.

      (2) There is no clear theoretical framework for the predictions or interpretations. The Results section is mostly a list of lots of different permutations of analyses (similarity within a group, between groups, between trials, across trials between subjects, during encoding vs. retrieval, frontal vs. hippocampal vs. parietal ROIs, etc). For each analysis, I did not have an intuition for what the prediction should be (e.g., should athletes have higher or lower pattern similarity?), and even after seeing all the results, I still do not have an intuition for how to interpret them. For the main results related to dissimilarity in prefrontal cortex, I would have, if anything, predicted the opposite: that when individuals are trained to use a common strategy, there would be MORE similarity between them. The Discussion acknowledges a very wide range of possible factors that might contribute to measures of similarity/dissimilarity, but I am ultimately left feeling that I have no idea how to interpret the results because the design and analyses were not structured such that any of these interpretations could be teased apart.

      (3) Same theme: the analyses shift from frontal regions (when looking at encoding) to hippocampus and precuneus (when looking at temporal recency). This shift in ROIs is confusing. The analyses (encoding vs. recognition) are essentially confounded with the ROIs (frontal vs. hippocampal/precuneus), so it's hard to know whether different analyses yielded different patterns or different ROIs yielded different patterns. Why were the frontal regions that were important for encoding ignored for the temporal recency judgments? And the fact that medial temporal lobe regions showed opposite effects to the frontal regions during encoding did not get much attention. Given that there were opposing patterns (dissimilarity vs. similarity) across different brain regions, the framing of the paper (that "the method of loci may bolster uniqueness") feels like a very selective representation of the data.

      (4) One of the more surprising aspects of the analyses (or at least one of the analyses) is that representational similarity analyses (RSA) are used to compare the average activity pattern (averaged across all trials) between different individuals. At a conceptual level, this really just reduces to a univariate analysis. It is not standard (or intuitive) to think about RSA that is essentially blind to the actual representational content. In other words, averaging across trials obviously washes out the content, and what is left are process-level effects. For process-level analyses, univariate analyses are far more common and seem more straightforward. However, these 'RSA' analyses are described as reflecting the "uniqueness of each word-location association" (an account which strongly implies content-level effects). This feels like an inappropriate description of what the analyses actually reflect.

      (5) I think the analysis looking at trial-by-trial similarity during word encoding (showing greater dissimilarity among the experienced individuals) is a somewhat interesting result, but again, I think the interpretation is very difficult. It is hard (or, impossible, I think) to get a clear sense of what is driving those differences. Is it the association of a unique spatial context? Is it somehow a product of better encoding, per se (as opposed to distinct spatial contexts)? These things could be tested by actually manipulating the spatial contexts in a more controlled way. For example, the paper by Liu et al. that is cited several times - and also a just-published paper by Christopher Baldassano (Nature Human Behaviour) - each used a very controlled paradigm where the (imagined) spatial location associated with each item was known/manipulated. However, the design of the current study does not allow for these things to be teased apart.

      (6) Relatedly, the training group seemed to receive instruction on a common spatial route, but, surprisingly, "Participants were free to choose which route and how many they would use to anchor the 72 items." Thus, if I understand correctly, we don't know whether the trained individuals were using common or distinct locations. And the fact that they learned a 50-location route but then studied a 72-word list is also a bit strange. Not having control or knowledge of the location that was associated with each word (sequence position) is a major limitation and also a major difference between the current study and other recent studies. For that matter, the word order was also randomized, so there was no control over whether the words and/or locations matched. These issues really complicate interpretation.

      (7) Again, same theme: for the result showing lower trial-by-trial similarity (within-subject similarity), the question is why, exactly, training/experience is associated with lower trial-by-trial similarity. Does training specifically or preferentially lead to greater differentiation between temporally-adjacent trials (as in Liu et al)? Does it lead to greater differentiation IF subjects associate each word with a unique location? Or maybe there is a more abstract effect of sequence/position that is independent of spatial location? Importantly, each of these three possibilities that I mention here has a precedent in prior studies that were more tightly controlled. But here, there is no way to tease these apart because of the experimental design, limiting the conclusions.

      (8) The ISC analysis described on p. 9 (line 328) is confusing. If I understand correctly, correlations between different trials were not computed (e.g., subject 1 trial 1 was not correlated with subject 2 trial 2). Rather, trial 1 was always correlated with trial 1 (in other subjects). Thus, it is not clear whether trial-level alignment matters at all. Maybe the same results would be obtained if there were no correspondence across subjects in trial number. Or if the trial order was shuffled within the subject. Given this, I simply don't know how to think about the data. And why did memory athletes show higher pattern similarity in this analysis as opposed to lower pattern similarity (as in some other analyses)? And why was this analysis performed by comparing memory athletes to each other as opposed to memory athletes to non-athletes? And, conceptually, why was this selective to the memory athletes or to the precuneus? And why was it selective to the temporal order test and not encoding? I am not asking the authors to answer each of these questions; rather, the point I am trying to make is that this analysis, and many of the analyses, seem to raise more questions than they answer.

      (9) The ISC analyses are interpreted in terms of scene construction and context reinstatement, but these conclusions go (very) far beyond what the data actually shows. Again, I don't see how this analysis lends itself to a meaningful conclusion. And this general critique applies to many of the analyses reported in this paper.

      (10) The fact that words were in random order per subject also makes the ISC analysis even more confusing to think about. The memory athletes had unique spatial routes (that they used for the method of loci) and unique word lists. So, why would it make sense to look at trial-level ISC? At a conceptual level, I simply don't understand what this is intended to capture.

      (11) Differences in the pattern of results between the encoding and temporal memory recognition task are hard to make sense of and are not addressed in much detail. Why would it make more sense to have across-trial similarity during recognition than during encoding? I think any account of this is very speculative.
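
      The shuffle control raised in point (8) can be made concrete with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' pipeline: it assumes each subject's data form a trials x voxels array, it uses synthetic data only, and the names `trialwise_isc`, `shuffled_isc`, and `simulate` are hypothetical. It compares same-trial intersubject correlation (trial 1 with trial 1, and so on) against the same measure after trial order has been shuffled independently within each subject. If the two are indistinguishable, trial-level alignment contributes nothing and the effect is process-level rather than content-level.

```python
import numpy as np


def _zscore_rows(x):
    """Z-score each trial pattern across voxels (population std)."""
    x = x - x.mean(axis=1, keepdims=True)
    return x / x.std(axis=1, keepdims=True)


def trialwise_isc(data):
    """Mean same-trial pattern correlation over all subject pairs.

    data: array of shape (n_subjects, n_trials, n_voxels).
    """
    n_sub, _, n_vox = data.shape
    z = np.stack([_zscore_rows(s) for s in data])
    pair_means = []
    for i in range(n_sub):
        for j in range(i + 1, n_sub):
            # Pearson r between trial t of subject i and trial t of subject j.
            r = (z[i] * z[j]).sum(axis=1) / n_vox
            pair_means.append(r.mean())
    return float(np.mean(pair_means))


def shuffled_isc(data, n_perm=200, seed=0):
    """Null distribution: trial order shuffled independently per subject."""
    rng = np.random.default_rng(seed)
    null = np.empty(n_perm)
    for p in range(n_perm):
        shuffled = np.stack([s[rng.permutation(s.shape[0])] for s in data])
        null[p] = trialwise_isc(shuffled)
    return null


def simulate(n_sub=6, n_trials=40, n_vox=100, trial_specific=True, seed=1):
    """Synthetic data: a shared signal plus independent noise per subject.

    trial_specific=True  -> a different shared pattern on every trial.
    trial_specific=False -> one shared pattern repeated on all trials.
    """
    rng = np.random.default_rng(seed)
    if trial_specific:
        signal = rng.standard_normal((n_trials, n_vox))
    else:
        signal = np.tile(rng.standard_normal(n_vox), (n_trials, 1))
    noise = rng.standard_normal((n_sub, n_trials, n_vox))
    return signal[None, :, :] + noise


if __name__ == "__main__":
    for flag in (True, False):
        d = simulate(trial_specific=flag)
        print(flag, trialwise_isc(d), shuffled_isc(d, n_perm=100).mean())
```

      In this toy setting, a trial-specific shared signal yields aligned ISC well above the shuffled null, whereas a signal that is identical on every trial yields aligned and shuffled values that are nearly the same. Because word order was randomized per subject in the study under review, the second outcome is the one a reader might expect by default, which is the core of the concern in points (8) and (10).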

    3. Reviewer #2 (Public review):

      The authors aim to understand how intensive training with the method of loci changes the brain systems that support memory in both elite "memory athletes" and previously untrained adults. They combine a cross-sectional comparison of athletes and matched controls with a longitudinal training study including mnemonic training, active working-memory training, and passive control groups, and use fMRI pattern-similarity analyses to characterise how brain activity patterns during learning and temporal-order judgments become more distinct or more shared within and across individuals.

      The dual design is a major strength. It combines findings from both real-world expertise and experimentally induced training and adds well-matched control groups. The representational similarity analyses are appropriate and reveal a clear, internally consistent picture in which learning with the method of loci leads to more idiosyncratic prefrontal and posterior cortical patterns during encoding, and more shared hippocampal-precuneus patterns during temporal-order retrieval, observed in both athletes and trained novices.

      However, the study is complex and the manuscript dense, and some secondary analyses feel less central or are difficult to interpret. More importantly, while the neural evidence for training-related changes in representational format is compelling, the behavioural relevance of these changes is less clearly supported. The key per-group brain-behaviour correlations are weak and inconsistent, and the direct association between neural and behavioural change across all subjects is not clearly presented.

      Overall, the work convincingly shows that extensive mnemonic practice reorganises neural representations in specific networks, but the strength and specificity of the claimed link to long-term memory improvements should be viewed as more tentative.

    4. Reviewer #3 (Public review):

      Summary:

      This study sought to explore how neural representations during encoding change with expertise or proficiency in the method of loci (MoL). To do this, the authors compared three groups: memory athletes (experts in MoL), naive controls, and naive participants before and after 6 weeks of MoL training, and analyzed how similar their encoding-related activity patterns were across groups and training. They found that in lateral prefrontal, inferior temporal, and posterior parietal regions, pattern similarity decreased with expertise and training. They also found that changes in similarity between pre- and post-training were associated with improvements in memory performance measured 4 months later. Additionally, in a follow-up exploratory analysis on the temporal order recognition task, neural patterns were more similar for those proficient in MoL - a contrast to the decrease seen at encoding. Taken together, the authors interpret these findings as evidence that proficiency with the method of loci is associated with distinct encoding representations. Broadly, the findings suggest that greater representational differentiation at encoding may be associated with better memory.

      Strengths:

      (1) The manuscript is impressively rich with analyses. Their general claim that neural differentiation increases between individuals with MoL experience is thus addressed in this work. Specifically, the authors effectively explore different levels of granularity to tackle the question of whether a participant's neural representation (with MoL experience) looks more similar to that of another (with less experience) during encoding.

      (2) The authors connect their hypotheses about neural representational differences caused by training to behavioral data (and 4 months later at that).

      (3) Although exploratory, they not only look at encoding-related differences, but also retrieval-related differences.

      (4) The authors provide many supplementary figures with complementary and interesting findings. As I read, I found myself curious about exploratory analyses, which were then addressed in supplementary figures.

      Weaknesses:

      (1) The manuscript is impressively rich, but the number of analyses and levels of comparison (and how they are presented) made it difficult to follow. The paper would benefit from an anticipatory introductory paragraph (or an introductory Results paragraph) that explicitly states the hypotheses and which sections of the results addressed them. Additionally, given how this is a Methods-last formatted paper, the manuscript would benefit from a few introductory sentences at each Results section describing the methodology.

      (2) One of the motivations needs strengthening. Given the introduction, the manuscript seems to be motivated by two complementary questions: (i) whether neural differentiation effects reported with short-term MoL training (as done in Liu et al., 2022) extend with longer-term training and expertise and (ii) whether training might lead individuals towards a canonical "expert" representation that can only be acquired through training as has been previously shown in other work (e.g., Meshulam et al., 2021).

      The first motivation is clear and compelling. The second one, however, does not feel as well grounded. In studies like Meshulam et al., alignment is expected because participants are exposed to the same stimulus or concept. In contrast, as the authors note, a user of the method of loci is encouraged to create unique, vivid representations of their loci and to-be-remembered items - here, neural alignment is at odds with the premise of the technique. As such, the described tension between increased pattern similarity across the studies cited in the second paragraph of the introduction and individuals proficient with MoL feels underdeveloped (despite the reference-rich second paragraph).

      The authors would benefit from articulating why the counterfactual of "increased neural alignment" might be expected, specifically, in the method of loci. In other words, why should we expect trainees to become more similar to experts when the strategy itself promotes idiosyncratic representations? Perhaps, the authors could distinguish between alignment at the level of knowledge representations vs the process of encoding (e.g., the act of placing items into loci).

      (3) Relatedly, terminology referencing the employed methodology is a bit unclear. In some of the papers cited that look at pattern similarity across people (like Meshulam et al., or Koch et al.), the spatial patterns of individuals are compared with 'template' patterns that reflect the canonical representation of a concept or episode. However, the manuscript does not include this type of template-based comparison. This is understandable because there may not be a representative canonical pattern when each participant has their own idiosyncratic palace. In this case, a pairwise comparison may be more fitting as it focuses on the distances between people's representations instead of the distances between them and a group template. Although both comparisons (pairwise and template-based similarities) are related, they have different interpretations. A clearer justification for why pairwise similarity, instead of template-based similarity (as in many of the cited papers), is the more appropriate metric in this paradigm early on would add to the clarity of the work. Additionally, this slight difference in methodology was confusing because some portions of the text (including the figures) say "group average", but in others, we see "pairwise".
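
      To make the distinction between the two metrics concrete, the following is a minimal sketch, not the authors' pipeline. It assumes each participant contributes one vectorized activity pattern for a region, and the function names `pairwise_similarity` and `template_similarity` are hypothetical. Pairwise similarity averages the correlations between all pairs of participants, whereas template-based similarity correlates each participant with a leave-one-out group average.

```python
import numpy as np


def pairwise_similarity(patterns):
    """Mean Pearson correlation over all distinct pairs of participants.

    patterns: array of shape (n_participants, n_voxels).
    """
    corr = np.corrcoef(patterns)                 # participants x participants
    upper = np.triu_indices_from(corr, k=1)      # each pair counted once
    return float(corr[upper].mean())


def template_similarity(patterns):
    """Mean correlation of each participant with a leave-one-out group average."""
    patterns = np.asarray(patterns, dtype=float)
    n = patterns.shape[0]
    total = patterns.sum(axis=0)
    sims = []
    for i in range(n):
        template = (total - patterns[i]) / (n - 1)   # average of everyone else
        sims.append(np.corrcoef(patterns[i], template)[0, 1])
    return float(np.mean(sims))
```

      Averaging suppresses idiosyncratic variance in the template, so with a weak shared component the template-based value will typically exceed the pairwise value for the same data. This is one reason the two numbers should not be read interchangeably.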

      Minor Comments:

      A recent paper (Masis-Obando et al., 2026, Nat Hum Behav) shows that stable and distinctive spatial representations can support later reinstatement of items placed within those contexts. Their conclusions seem to support your hypotheses and results here. In parallel, prior work (like Robin et al., 2018, J Neurosci) emphasizes the importance of spatial contexts for the representation of events. Given how MoL encoding relies on vivid context-item binding, including these perspectives in the Introduction (and/or discussion) may help frame the current findings within the broader memory literature.

      Overall, this work provides rich and valuable contributions to the field.

    1. eLife Assessment

      This study provides insight into the role of Chi3l1 in liver macrophages during metabolic disease. The evidence is solid, with the authors now addressing most concerns, although one key conclusion is not fully supported by the data presented. Overall, the work offers a useful contribution to the field.

    2. Reviewer #2 (Public review):

      In this revised version of the manuscript, the authors have addressed many of my concerns. The representative confocal images now provided allow a much better assessment of the claims being made and of the underlying data, for example the level of Chi3l1 protein expression in the macrophages.

      One concern remains, and it bears on a main claim of the manuscript: that loss of Chi3l1 drives KC death in MASLD. This claim is based on gene expression profiles and the presence of TUNEL staining in liver sections. However, KC numbers are not altered compared with WT when assessed by flow cytometry. This discrepancy is not really addressed. If the cells are not actually dying, this would explain the lack of moKCs (a concern raised by reviewer 1) and would indeed suggest that the loss of these cells is, as suggested by that reviewer, trivial in this timeframe. The authors propose in their rebuttal that the KCs are in a prolonged state of stress, explaining the TUNEL staining, but to claim that they die, the authors need to show their eventual loss from the liver. Otherwise, the claims of death should be revised.

    3. Reviewer #3 (Public review):

      This paper investigates the role of Chi3l1 in regulating the fate of liver macrophages in the context of metabolic dysfunction leading to the development of MASLD.

      Comments on revisions:

      My comments have been addressed.

    4. Author response:

      The following is the authors’ response to the previous reviews

      Public Reviews:

      Reviewer #1 (Public review):

      The manuscript by Shan et al. seeks to define the role of the CHI3L1 protein in macrophages during the progression of MASH. The authors argue that the Chil1 gene is highly expressed in hepatic macrophages. Subsequently, they use Chil1 flx mice crossed to Clec4F-Cre or LysM-Cre to assess the role of this factor in the progression of MASH using a high-fat, high-fructose diet (HFFC). They found that loss of Chil1 in KCs (Clec4F-Cre) leads to enhanced KC death and worsened hepatic steatosis. Using scRNA-seq, they also provide evidence that loss of this factor promotes gene programs related to cell death. From a mechanistic perspective, they provide evidence that CHI3L serves as a glucose sink and thus loss of this molecule enhances macrophage glucose uptake and susceptibility to cell death. Using a bone marrow macrophage system and KCs, they demonstrate that cell death induced by palmitic acid is attenuated by the addition of rCHI3L1. While the article is well written and potentially highlights a new mechanism of macrophage dysfunction in MASH, and the authors have addressed some of my concerns, there are some concerns about the current data that continue to limit my enthusiasm for the study. Please see my specific comments below.

      Major:

      (1) The authors' interpretation of the results from the KC (Clec4F) and MdM KO (LysM-Cre) experiments is flawed. The authors have added new data suggesting that LysM-Cre leads to only a 40% reduction of Chil1 in KCs and that this explains the difference in phenotype compared with Clec4F-Cre. However, this claim would be stronger if flow-sorted TIM4hi KCs were used, as the plating method can yield heterogeneous populations and thus an underestimation of knockdown by qPCR. Moreover, in the supplemental data the authors show that Clec4f-Cre x Chil1flx leads to a significant knockdown of this gene in BMDMs. As BMDMs do not express Clec4f, this result calls the rigor of the data into question. I am still concerned that the phenotypic differences between Clec4f-Cre and LysM-Cre are related not to the degree of knockdown in KCs but to some other aspect of the model (microbiota, etc.). It would be more convincing if the authors could show the CHI3L reduction via IF in the tissue of these mice.

      We thank the reviewer for these constructive comments. We have performed FACS sorting of KCs (CD45<sup>+</sup> F4/80<sup>hi</sup> CD11b<sup>low</sup> TIM4<sup>hi</sup>) or MoMFs (CD45<sup>+</sup> F4/80<sup>low</sup> CD11b<sup>hi</sup> Ly6G<sup>-</sup> TIM4<sup>-</sup>) from Chil1<sup>fl/fl</sup> and Lyz2<sup>∆Chil1</sup> or Clec4f<sup>∆Chil1</sup> mice, respectively. Compared with Chil1<sup>fl/fl</sup> mice, Chil1 mRNA levels were reduced by more than 90% in KCs from Clec4f<sup>∆Chil1</sup> mice but were unchanged in MoMFs (Revised Figure S3B). In addition, compared with Chil1<sup>fl/fl</sup> mice, Chil1 mRNA levels were reduced by more than 90% in MoMFs from Lyz2<sup>∆Chil1</sup> mice but by only roughly 40% in KCs (Revised Figure S5B). These revised data support the phenotypic difference between Lyz2-CKO and Clec4f-CKO mice.

      We agree with the reviewer that the significant knockdown of Chil1 in BMDMs from Clec4f<sup>∆Chil1</sup> mice is confusing. To preserve the rigor of our data, we have removed this part from our manuscript.

      Additionally, we performed immunofluorescence staining to detect Chi3l1 expression in liver tissues of these mice. The results show a reduction of Chi3l1 expression in KCs (TIM4+F4/80+ cells) of both Lyz2<sup>∆Chil1</sup> and Clec4f<sup>∆Chil1</sup> mice, with a more pronounced decrease in Clec4f<sup>∆Chil1</sup> mice (Author response image 1).

      Author response image 1.

      The expression of Chi3l1 in liver tissues of Chil1<sup>fl/fl</sup>, Lyz2<sup>∆Chil1</sup> and Clec4f<sup>∆Chil1</sup> mice. Immunofluorescent staining to detect Chi3l1 (green) expression in liver sections of Chil1<sup>fl/fl</sup>, Lyz2<sup>∆Chil1</sup> and Clec4f<sup>∆Chil1</sup> mice under normal chow diet. TIM4 (KC marker, white), F4/80 (macrophage marker, red); nuclei were counterstained with DAPI. Scale bar = 20 µm and 10 µm (inset).

      (2) Figure 4 suggests that KC death is increased with KO of Chil1. The authors have added new data with TIM4 that better characterize this phenotype. The lack of TIM4 low, F4/80 hi cells further supports that their diet model is not producing any signs of the inflammatory changes that occur with MASLD and MASH. This is also supported by the absence of meaningful changes in the CD11b hi, F4/80 int cells (which are predominantly monocytes and early MdMs). It is also concerning that loss of KCs does not lead to an increase in Mo-KCs, as has been demonstrated in several studies (PMID: 37639126, PMID: 33997821). This would suggest that the degree of resident KC loss is trivial.

      We appreciate the reviewer’s insightful comment. We agree that our data show no substantial generation of monocyte-derived Kupffer cells (MoKCs) within the 16-week HFHC model. However, we do not believe the degree of resident KC loss is trivial, since 60% of KCs die by 16 weeks compared with week 0 (Revised Figure 5D). Instead, our observations align with a phased replacement model: recruited monocytes first differentiate into monocyte-derived macrophages (MoMFs), which we see accumulate (Revised Figure 5D), and only later adopt a KC phenotype. Consistent with this, our 16-week model shows significant EmKC loss and MoMF expansion, but not yet the emergence of TIM4-MoKCs. This timing is supported by prior studies, where TIM4KCs were observed at 24 weeks, but not at 16 weeks, on similar diets (PMID: 33440159; PMID: 32888418). Therefore, we interpret our findings as capturing an earlier phase of MASLD progression, characterized by EmKC death and MoMF accumulation, prior to their full differentiation into MoKCs.

      (3) The authors demonstrated that Clec4f-Cre itself was not responsible for the observed phenotype, which mitigates my concerns about this influencing their model.

      We thank the reviewer for this comment and are pleased they agree that our control experiment using Clec4f-Cre alone confirms that the phenotype is specific to our genetic manipulation and not an artifact of the Cre driver.

      (4) I remain somewhat concerned about the conclusion that Chil1 is highly expressed in liver macrophages. The author agrees that mRNA levels of this gene are hard to see in the datasets; however, they argue that IF demonstrates clear evidence of the protein, CHI3L. The IF in the paper only shows a high power view of one KC. I would like to see what percentage of KCs express CHI3L and how this changes with HFHC diet. In addition, showing the knockout IF would further validate the IF staining patterns.

      We thank the reviewer for their thoughtful and constructive feedback. We agree that our initial conclusion regarding Chil1 expression in liver macrophages relied heavily on prior observations and was not sufficiently supported by the data presented. In response, we have revised our conclusion to state: "Hepatic macrophages express Chi3l1 and upregulate its expression following HFHC feeding." (Revised manuscript, page 4, lines 136–137)

      To strengthen this finding, we have replaced the original high-power image of a single Kupffer cell with a representative low-power view showing multiple F4/80+ macrophages (Revised Figure 1A). Furthermore, we performed quantitative colocalization analysis, which revealed that under normal chow diet (NCD), approximately 8% of F4/80+ macrophages are Chi3l1-positive. This proportion significantly increases to 15% upon HFHC feeding (Revised Figure 1A).

      Additionally, to validate the specificity of the Chi3l1 immunofluorescence signal, we have included staining of liver sections from Chil1 knockout mice. In contrast to wild-type mice, Chi3l1 signal was completely absent within F4/80+ macrophages in Chil1<sup>-/-</sup> mice, confirming the specificity of the staining (Revised Figure 1B, Revised manuscript, page 4, lines 152–157).

      Minor:

      (1) The authors have answered my question about liver fibrosis. In line with their macrophage data, their diet model does not appear to induce even mild MASH.

      We thank the reviewer for this observation. We agree that under our HFHC dietary conditions, the mice do not develop MASH pathology. However, we believe this early-stage model is a strength of our study, as it allows us to dissect the initial role of the Chi3l1-glucose interaction in regulating Kupffer cell fate during early MASLD, prior to the onset of significant fibrosis. This approach enables us to capture early macrophage adaptations (such as Chi3l1 upregulation) that might otherwise be masked or become secondary to the overt inflammation and scarring characteristic of late-stage MASH models.

      Reviewer #2 (Public review):

      In the revised version of the manuscript, the authors have attempted to address my questions, however, a number of my original concerns still remain.

      Firstly, I had asked for a validation of the different Cre lines used, LysM and Clec4f. The authors have now looked at BMDMs and KCs (steady state) from these animals. They conclude that LysM targets only BMDMs, not KCs, while Clec4f targets both KCs and BMDMs. This I do not understand: BMDMs do not express CLEC4F, so why are they targeted with this Cre? Additionally, BMDMs are not the correct control here; rather, the authors should look at the incoming moMFs in the livers of these mice in the MASLD setting. Similarly, the KO in the MASLD KCs should be verified.

      We thank the reviewer for these constructive comments. We have performed FACS sorting of KCs (CD45<sup>+</sup> F4/80<sup>hi</sup> CD11b<sup>low</sup> TIM4<sup>hi</sup>) or MoMFs (CD45<sup>+</sup> F4/80<sup>low</sup> CD11b<sup>hi</sup> Ly6G<sup>-</sup> TIM4<sup>-</sup>) from Chil1<sup>fl/fl</sup> and Lyz2<sup>∆Chil1</sup> or Clec4f<sup>∆Chil1</sup> mice fed NCD or HFHC for 4 weeks, respectively. Compared with Chil1<sup>fl/fl</sup> mice, Chil1 mRNA levels were reduced by more than 90% in KCs from Clec4f<sup>∆Chil1</sup> mice but were unchanged in MoMFs at both 0 and 4 weeks (Revised Figure S3B). In addition, compared with Chil1<sup>fl/fl</sup> mice, Chil1 mRNA levels were reduced by more than 90% in MoMFs from Lyz2<sup>∆Chil1</sup> mice but by only roughly 40% in KCs at both 0 and 4 weeks (Revised Figure S5B). These revised data support the phenotypic difference between Lyz2-CKO and Clec4f-CKO mice.

      Then I had asked for validation of macrophage expression of Chil1 in other MASLD human and mouse datasets. The authors have looked into this, but the data provided do not suggest it is highly expressed by these cells either in the other mouse models or in the human. Nevertheless, they include a statement suggesting a similar expression pattern (although also being expressed by other cells). This is not an accurate discussion of the data and hence must be revised. This also prompted me to take another look at their data and this has left me querying the data in Figure 1D. Is the percent expressed 1%? In Figure 1C the scale goes from 0-100 but here 0-1. If we are talking about expression in 1% of cells which would fit with the additional public mouse data now analysed then how relevant are any of these claims? How sure are the authors that the effects seen are through KCs/moMFs? In figure 1D all cells profiled by scRNA-seq should be shown not just MFs to get a better sense of this data. What is macrophage expression of Chil1 compared with all other liver cells?

      We thank the reviewer for the thoughtful feedback. We agree that the expression pattern of Chil1 should be described more accurately. To address this point, we examined four additional publicly available scRNA-seq datasets, including two mouse MASLD models and two human MASLD datasets (Author response image 2). Across these studies, the cell type with the highest Chil1 expression varied, whereas Chil1 transcripts were detected at relatively low frequency in macrophages (~1% of cells; Author response image 2C, E, K). To better present these data, we regenerated the UMAP plots to include all captured liver non-parenchymal cells, defined using the top two lineage-specific markers (Author response image 3A–B). Consistent with Figure 2A–C, violin plots show that Chil1 is highly expressed in neutrophils, with only modest expression detected in macrophages (Author response image 3C). Further analysis of monocyte/macrophage subsets indicates that approximately 1% of MoMFs or KCs express Chil1 (Author response image 3D–F). As the reviewer noted, the y-axis in Author response image 3F ranges from 0–1%, reflecting the low transcriptional detection frequency of Chil1 in macrophages, which is consistent with the additional public datasets analyzed.
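
      For readers less familiar with the "percent expressed" axis of a dot plot, the following is a minimal sketch of how such a detection frequency is commonly computed. It is an illustration under stated assumptions, not the analysis code used for the figures: the function name `percent_expressing` and the toy inputs are hypothetical, and a cell is counted as expressing the gene if it has at least one detected transcript.

```python
import numpy as np


def percent_expressing(counts, clusters):
    """Percent of cells per cluster with at least one transcript of a gene.

    counts:   per-cell transcript counts for a single gene.
    clusters: per-cell cluster labels, same length as counts.
    """
    counts = np.asarray(counts)
    clusters = np.asarray(clusters)
    result = {}
    for name in np.unique(clusters):
        in_cluster = clusters == name
        # Fraction of cells in this cluster with a nonzero count, as a percentage.
        result[str(name)] = 100.0 * float(np.mean(counts[in_cluster] > 0))
    return result
```

      Because this quantity depends only on whether a transcript was captured, it can be low for genes with few mRNA copies per cell even when the protein is readily detectable by staining.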

      We also recognize that mRNA detection by scRNA-seq does not necessarily reflect protein abundance. Therefore, we assessed Chi3l1 protein expression in hepatic macrophages using immunofluorescence staining for F4/80, TIM4, and Chi3l1 in liver sections from mice fed either normal chow diet (NCD) or HFHC diet. These analyses show that Chi3l1 protein is detectable in both KCs (TIM4<sup>+</sup>F4/80<sup>+</sup>) and MoMFs (TIM4<sup>-</sup>F4/80<sup>+</sup>) (Revised Figure 1A). Quantitative colocalization analysis revealed that under NCD conditions, approximately 8% of F4/80<sup>+</sup> macrophages are Chi3l1-positive, which increases to ~15% following HFHC feeding (Revised Figure 1A). To confirm antibody specificity, we additionally performed staining in Chil1 knockout mice. In contrast to wild-type mice, Chi3l1 signal was completely absent in F4/80<sup>+</sup> macrophages from Chil1<sup>-/-</sup> mice, validating the specificity of the staining (Revised Figure 1B). Together, these results suggest that low-abundance Chil1 transcripts may be under-detected by scRNA-seq, whereas immunofluorescence captures accumulated protein. Importantly, our functional experiments using Clec4f-Cre–mediated deletion directly support that the observed phenotypes are mediated through Kupffer cells, regardless of expression levels in other liver cell types.

      In response to the reviewer’s comments, we have made the following revisions:

      (1) Softened our conclusion to: “Hepatic macrophages express CHI3L1 and upregulate its expression following HFHC feeding” (Revised manuscript, page 4, lines 136–137).

      (2) Included representative low-magnification images showing multiple F4/80<sup>+</sup> macrophages along with quantitative analysis (Revised Figure 1A).

      (3) Added immunofluorescence staining of Chil1<sup>-/-</sup> liver sections demonstrating complete absence of Chi3l1 signal in F4/80<sup>+</sup> macrophages, validating antibody specificity (Revised Figure 1B).

      (4) Regenerated UMAP plots to display all liver non-parenchymal cells and clearly indicate the low detection frequency of Chil1 transcripts in macrophages (Author response image 3).

      (5) Revised the relevant text to more accurately describe Chil1 expression patterns in hepatic macrophages (Revised manuscript, page 4, lines 136–157).

      Author response image 2.

      Analysis of Chil1 expression in additional single-cell RNA sequencing datasets. (A-C) Chil1 expression in a mouse model of NASH. (A) t-SNE projection of cell clusters from scRNA-seq data (GSE1283338) of livers from C57BL/6J mice fed a control or NASH diet for 30 weeks. (B) Dot plot showing scaled Chil1 expression across all identified cell clusters. (C) Dot plot of scaled Chil1 expression after excluding the neutrophil cluster, highlighting expression in macrophage populations. Analyzed cell clusters and cell numbers: KC_H (healthy, 1178); KC3_Control (1142); KC_N (NASH, 1045); KN_RM (recruited macrophage in KC niche, 950); Proliferating_KC (364); PDC_Control (356); Ly6CHi_RM (320); LSEC (299); NK_NKT (393); B_cell (244); DC_1 (107); DC_2 (118); Ly6CLo_RM (127); Hepatocyte (57); PDC_NASH (46); Neutrophil (21). (D-E) Chil1 expression during NAFLD progression in a mouse Western diet model. (D) t-SNE projection of cell clusters from scRNA-seq data (GSE156059) of livers from C57BL/6J mice fed a Western diet with fructose/sucrose for 12, 24, and 36 weeks. (E) Dot plot showing scaled Chil1 expression across all identified cell clusters. Analyzed cell clusters and cell numbers: capsule macs (250), LAMs (1419), Ly6chi monocytes (6912), mac1 (638), moKCs (767), Patrolling monocytes (690), Prolif.macs (521), Resident KCs (3629), Transitioning monocytes (3615). (F-H) Chil1 expression in human cirrhotic liver biopsies. (F) t-SNE projection of cell clusters from scRNA-seq data (GSE136103) of healthy and cirrhotic human liver samples. (G) Dot plot showing scaled Chil1 expression across major cell lineages. (H) Dot plot of scaled Chil1 expression specifically within the mononuclear phagocyte (MP) population. Analyzed cell clusters and cell numbers: B cell (1951); cycling (967); Epithelia (3751); ILC (10091); mast cell (2511); Mesenchyme (2382); MP (10874); pDC (317); Plasma cell (877); T cell (19076). (I-K) Chil1 expression in a human NAFLD explant. (I) t-SNE projection of cell clusters from scRNA-seq data (GSE190487) of a human NAFLD liver explant. (J) Dot plot showing scaled Chil1 expression across all identified cell clusters. (K) Dot plot of scaled Chil1 expression within the MP subpopulations. Analyzed cell clusters and cell numbers: B cell (1278); Cycling (152); MP (2897); pDC (391); Plasma cell (85); T cell (1551); KC (403); SAMac (scar-associated macrophages, 723); TM (tissue monocytes, 1265).

      Author response image 3.

      Hepatic macrophages express Chi3l1. (A-D) Wildtype C57BL/6J mice were fed either a normal chow diet (NCD) or HFHC for 16 weeks. NPCs were isolated and subjected to BD Rhapsody scRNA sequencing. (A) Uniform manifold approximation and projection (UMAP) plots illustrate the clustering of NPCs from the livers of mice fed NCD and HFHC. Major cell types are colored. (B) Heatmap showing the mean expression of top2 markers of each cell type. (C) Violin plots show the RNA expression of Chil1 between NCD and HFHC livers in each cell cluster. (D) UMAP plots depict the clustering of Monocytes/Macrophages in the livers of mice fed NCD and HFHC. Cell clusters are color-coded. (E) Dot plot displays the scaled gene expression levels of lineage-specific marker genes in different cell clusters. (F) Dot plot shows the scaled gene expression levels of Chil1 in the indicated cell clusters.

      The cell death data had also previously concerned me, in that 40-60% of KCs were TUNEL +ve. I do not understand how 60% are +ve at 8 weeks but then there are more or less the same number of TIM4+ cells at 16 weeks. How can this be? Why do the TUNEL +ve cells not die? This concern remains, as I don't understand how they reached these numbers given the images. Additionally, larger images were not provided to confirm that the images in the figure are representative. In the images now provided, there are clearly TIM4+ cells where the TUNEL signal does not overlap; likely it is in an LSEC or other neighbouring cell. Indeed, taking Fig S11b as an example, there are ~7 KCs and at best 1 is TUNEL +ve, so how do they get to 60%?

      We thank the reviewer for this constructive feedback. We agree that the sustained TUNEL positivity without corresponding KC depletion presents an apparent paradox. Based on our data, we propose that TUNEL-positive KCs represent cells in a prolonged stressed or pre-apoptotic state rather than undergoing immediate clearance. This interpretation is supported by the relatively stable TIM4+ cell numbers between 8 and 16 weeks, which would be inconsistent with rapid cell death and removal. Previous studies (PMID: 33440159; PMID: 32888418) have similarly documented gradual KC loss during MASLD progression, supporting our view that KC death occurs over an extended timeframe rather than acutely.

      Regarding quantification concerns, we acknowledge that the representative images in the original figure may have been misleading. To address this, we have now quantified KC apoptosis using low-magnification fields across multiple liver sections to ensure statistical rigor. Figure S11B (now Revised Figure S9B) presents these data, showing that under NCD conditions, KC apoptosis rates are minimal in both genotypes. Following HFHC feeding, apoptosis rates are comparable between Chil1<sup>fl/fl</sup> and Lyz2<sup>ΔChil1</sup> mice. Importantly, we have replaced all TIM4/TUNEL co-staining images with low-magnification representative images in the revised figures (Revised Figure 1A, 1B, 5E, S9A, S9B). These images better reflect the quantitative data and confirm that the originally highlighted high-magnification fields were not representative of global apoptosis rates.

      Reviewer #3 (Public review):

      This paper investigates the role of Chi3l1 in regulating the fate of liver macrophages in the context of metabolic dysfunction leading to the development of MASLD. I do see value in this work, but some issues exist that should be addressed as well as possible.

      Here are my comments:

      (1) Chi3l1 has been linked to macrophage functions in MASLD/MASH, acute liver injury, and fibrosis models before (e.g., PMID: 37166517), which limits the novelty of the current work. It has even been linked to macrophage cell death/survival (PMID:31250532) in the context of fibrosis, which is a main observation from the current study.

      We thank the reviewer for raising this important point and acknowledge previous studies linking Chi3l1 to macrophage function in liver disease. However, several aspects of our work extend beyond these prior reports. First, although global Chi3l1 deficiency has been shown to promote macrophage apoptosis in toxin-induced fibrosis models (PMID: 31250532), our study demonstrates that Chi3l1 differentially regulates the fate of two distinct hepatic macrophage subsets, embryo-derived Kupffer cells (KCs) and monocyte-derived macrophages (MoMFs), in MASLD. To our knowledge, this subset-specific regulation of hepatic macrophages has not been previously described. Second, we identify a previously unrecognized metabolic mechanism by which Chi3l1 regulates macrophage survival. Specifically, we find that Chi3l1 binds glucose and promotes glucose uptake, thereby protecting the highly glucose-dependent KCs from metabolic stress–induced death, while exerting minimal effects on MoMFs. This mechanism is distinct from the previously reported Fas/Akt-mediated pathway (PMID: 31250532) and highlights a metabolic checkpoint controlling macrophage subset-specific vulnerability. Third, our findings reveal context- and cell type-dependent roles of Chi3l1. While myeloid-specific deletion of Chi3l1 has been reported to ameliorate steatohepatitis and fibrosis (PMID: 37166517), our KC-specific deletion model shows that loss of Chi3l1 in KCs exacerbates disease, indicating a previously unrecognized protective role of Chi3l1 in KCs during early MASLD. Together, these findings provide new insights into macrophage subset-specific regulation, identify a novel glucose-related metabolic mechanism, and reveal context-dependent functions of Chi3l1 in MASLD pathogenesis.

      (2) The LysCre-experiments differ from experiments conducted by Ariel Feldstein's team (PMID: 37166517). What is the explanation for this difference? - The LysCre system is neither specific to macrophages (it also depletes in neutrophils, etc), nor is this system necessarily efficient in all myeloid cells (e.g., Kupffer cells vs other macrophages). The authors need to show the efficacy and specificity of the conditional KO regarding Chi3l1 in the different myeloid populations in the liver and the circulation.

      We thank the reviewer for raising this important point regarding the specificity of the genetic models and the apparent discrepancy with the study by Feldstein and colleagues (PMID: 37166517). To address these concerns, we performed additional experiments to directly assess the efficiency and cell-type specificity of Chi3l1 deletion in our models.

      (1) Efficiency and specificity of LysM-Cre and Clec4f-Cre models

      We isolated KCs (CD45<sup>+</sup> F4/80<sup>hi</sup> CD11b<sup>low</sup> TIM4<sup>hi</sup>) or MoMFs (CD45<sup>+</sup> F4/80<sup>low</sup> CD11b<sup>hi</sup> Ly6G<sup>-</sup> TIM4<sup>-</sup>) by FACS from Chil1<sup>fl/fl</sup>, Lyz2<sup>∆Chil1</sup> and Clec4f<sup>∆Chil1</sup> mice fed either NCD or HFHC diet. Consistent with the known specificity of these Cre lines, Clec4f-Cre resulted in >90% reduction of Chil1 mRNA in KCs with no significant change in MoMFs (Revised Figure S3B), confirming efficient KC-specific deletion. In contrast, LysM-Cre reduced Chil1 expression by >90% in MoMFs but only ~40% in KCs (Revised Figure S5B). These data support the reviewer’s concern that LysM-Cre mediates incomplete recombination in KCs, whereas the Clec4f-Cre model provides KC-specific deletion, explaining why the phenotype observed in Lyz2<sup>∆Chil1</sup> mice is relatively modest.

      (2) Relationship to the study by Feldstein et al.

      We agree that our LysM-Cre results appear different from those reported by Feldstein and colleagues. However, considering the new recombination data and differences in disease models, we believe the findings are complementary rather than contradictory. First, the disease models differ substantially. Feldstein et al. used a CDAA-HFAT diet for 10 weeks, which rapidly induces severe inflammation and fibrosis, whereas our study employed a long-term HFHC diet, modeling the more gradual metabolic progression of MASLD. These distinct disease contexts may engage different CHI3L1-dependent pathways. Second, the mechanistic focus differs. Feldstein et al. reported that myeloid Chi3l1 promotes steatohepatitis and fibrosis through inflammatory macrophage recruitment and IL13Rα2-mediated stellate cell activation. In contrast, our study identifies a metabolic mechanism in which CHI3L1 binds glucose and promotes glucose uptake, protecting metabolically vulnerable KCs from stress-induced death. Finally, and importantly, KC-specific deletion using Clec4f-Cre recapitulates the key phenotypes observed in our study, including effects on KC survival and metabolic regulation. This confirms that the observed effects are KC-autonomous and not due to broader Cre activity in other myeloid populations.

      Together, these additional experiments clarify the recombination efficiency of our models and demonstrate that our conclusions are supported by KC-specific genetic evidence.

      (3) The conclusions are exclusively based on one MASLD model. I recommend confirming the key findings in a second, ideally a more fibrotic, MASH model.

      We thank the reviewer for this valuable suggestion. To address this point, we tested our key findings in an additional MASH model using a methionine–choline-deficient (MCD) diet. First, we examined Chi3l1 expression in this model. Wild-type mice fed an MCD diet for 6 weeks showed significantly increased Chi3l1 mRNA and protein levels in liver tissues compared with NCD controls, confirming diet-induced upregulation (Revised Figure 3A–B). To determine the functional contribution of Kupffer cell–derived Chi3l1, we subjected Clec4f<sup>ΔChil1</sup> mice and Chil1<sup>fl/fl</sup> controls to MCD feeding for 6 weeks. Body weight was comparable between genotypes throughout the feeding period (Revised Figure 3C). However, KC-specific deletion of Chi3l1 significantly exacerbated MCD diet–induced liver pathology, including increased steatosis, inflammation, and fibrosis, as indicated by higher MASLD activity scores, enhanced Oil Red O staining, increased Sirius Red deposition, and elevated α-SMA expression (Revised Figure 3D). Consistent with these histological findings, Clec4f<sup>ΔChil1</sup> mice exhibited an increased liver index, whereas serum ALT levels remained comparable between groups, suggesting increased hepatic lipid accumulation rather than aggravated hepatocellular injury (Revised Figure 3E). In addition, serum and hepatic triglyceride levels and serum cholesterol were significantly elevated, while hepatic cholesterol levels were not significantly different from controls (Revised Figure 3E). Together, these results validate our findings in an independent MASH model and further support a protective role for Kupffer cell–derived Chi3l1 in limiting steatosis and disease progression (Revised manuscript, page 5, line 188-205).

      (4) Very few human data are being provided (e.g., no work with own human liver samples, work with primary human cells). Thus, the translational relevance of the observations remains unclear.

      We thank the reviewer for raising this important point. We agree that additional human validation would further strengthen the translational relevance of our findings. We initially attempted to examine macrophage cell death in human liver samples by performing TUNEL and F4/80 co-staining on human liver cancer tissues. However, we did not detect clear colocalization in these samples. We speculate that this may reflect differences in disease context and stage, as the available samples represent end-stage liver disease, whereas our study focuses on early MASLD progression. Despite this limitation, we provide several lines of evidence supporting the human relevance of our findings. First, analysis of multiple public human MASLD scRNA-seq datasets demonstrates Chi3l1 expression in hepatic macrophages (Figure 2F–K). Second, analysis of public bulk RNA-seq datasets shows that Chi3l1 expression positively correlates with MASLD disease activity and progression (Revised Figure 1E–F). Third, our observations are consistent with previous clinical studies reporting elevated CHI3L1 levels in patients with MASLD/MASH and advanced liver disease. We acknowledge that functional validation in primary human macrophages or human liver tissues would further strengthen the translational significance of this work. This limitation and future direction have now been added to the Discussion (Revised manuscript, page 10, lines 409–411).

      Comments on revisions:

      The authors have done a thorough job addressing my comments. However, I am not convinced about the MCD diet model, which is somewhat hidden in the Supplementary Files. Neither seems MASH different nor are any fibrosis data shown to support the conclusions. I am not satisfied with this part of the revised manuscript, and I do not agree that the second MASH model would support the conclusions.

      We thank the reviewer for their continued careful evaluation and for highlighting the need for clearer presentation of the MCD model data. To address this concern, we have substantially revised this section of the manuscript. First, the MCD model results have now been moved from the Supplementary Figure to a new main figure (Revised Figure 3) to improve visibility and clarity. Second, we have added additional fibrosis analyses, including Sirius Red staining and α-SMA immunostaining, to directly assess fibrotic changes. These analyses show that MCD feeding induces significant collagen deposition in control mice and that fibrosis is further increased in Clec4f<sup>ΔChil1</sup> mice (Revised Figure 3D). Importantly, the MCD model recapitulates the key phenotypes observed in the HFHC model, with KC-specific Chi3l1 deletion leading to increased MASLD progression. These findings support the conclusion that the protective role of Kupffer cell–derived Chi3l1 is not restricted to a single dietary model, but is observed across distinct models of steatohepatitis. We hope that these revisions clarify the results and strengthen the evidence supporting our conclusions.

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      Minor:

      Line 73 - should be moMfs not moKCs

      We thank the reviewer for this helpful comment. The term moKCs was used intentionally in line 73 to refer to monocyte-derived Kupffer cells, rather than MoMFs (monocyte-derived macrophages). To avoid potential confusion, we have clarified the terminology in the revised manuscript.

      Methods: diet is mentioned for 6 weeks but for HFHC should be 16.

      The correction has been made in the Methods section (page 3, line 115).

      Liver/body weight ratios are >3 then I think it is body/liver weight ratio?

      We thank the reviewer for this query. The reported values represent liver-to-body weight ratios, calculated as (liver weight ÷ body weight) × 100%. A value of ~3% is consistent with the expected range for mice with MASLD-associated hepatomegaly.

      This clarification has been added to the revised figure legend.
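
      As a worked illustration of the formula quoted above, the following minimal sketch computes the liver-to-body weight ratio as a percentage. The function name and the weights are hypothetical and are not taken from the study.

```python
def liver_index(liver_weight_g: float, body_weight_g: float) -> float:
    """Liver-to-body weight ratio expressed as a percentage."""
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    return liver_weight_g / body_weight_g * 100.0


# Hypothetical mouse (illustrative values only): 1.2 g liver, 40 g body weight.
ratio = liver_index(1.2, 40.0)
print(f"liver index = {ratio:.1f}%")
```

      For the same hypothetical animal, the inverse ratio (body weight ÷ liver weight) would be roughly 33, which is why values near 3 are read as liver-to-body percentages.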

      Figure 5F - what happens in Clec4f-CRE mice fed HFHC?

      We thank the reviewer for this question. Western blot analysis showed that HFHC feeding upregulated Chi3l1 protein in the livers of Clec4f-Cre mice (Author response image 4), similar to the increase observed in wild-type mice.

      Author response image 4.

      Expression of Chi3l1 in the serum of Clec4f-Cre mice. (A) Western blot detecting Chi3l1 in the serum of Clec4f-Cre mice before and after HFHC feeding. n=3 mice/group.

    1. eLife Assessment

      This study presents an important examination of the role of cis-acting versus trans-acting genetic variation on DNA methylation divergence between humans and chimpanzees, including its consequences for gene expression. By differentiating fused interspecies tetraploid cell lines into multiple cell types, the study provides compelling evidence for the importance of cis-acting changes, but incomplete evidence that these changes are of importance for adaptive trait evolution in humans. This work will be of interest to biologists and evolutionary anthropologists studying the evolution and genetics of gene regulation, particularly in primates.

    2. Reviewer #1 (Public review):

      Ma et al. use human-chimpanzee tetraploid cells across different cell types to identify the genetic causes and then transcriptomic consequences of divergence in DNA methylation. They conclude that the evolution of DNA methylation is driven primarily by cis-regulatory changes, and that the evolution of CpG sites contributes to cis-regulation, while transcription factor expression underlies some trans changes. They then argue that divergence in DNA methylation is associated with changes in gene expression and may contribute to human phenotypes.

      The tetraploid model is able to provide compelling evidence that most regulatory evolution occurs due to cis-regulatory changes. My only concern is that the extent of trans-changes may be overstated, as almost all are eliminated by changing from a nominal p-value criterion to even a 25% false discovery rate. The follow-up analyses are incomplete with major gaps. The authors focus on single potential mechanisms for cis- and trans-changes, but it is not clear to what degree these mechanisms explain the extent of cis and trans changes. There are also other mechanisms which are not investigated, such as the importance of TF binding sites for cis-regulatory evolution. While likely beyond the scope of this work, communicating these areas for future work would have helped define the niche for this manuscript.
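
      To illustrate why the choice of threshold matters for this concern, the following is a minimal sketch of the Benjamini–Hochberg procedure, a standard way to control the false discovery rate. Whether the manuscript uses exactly this procedure is an assumption, and the p-values are invented for illustration.

```python
def benjamini_hochberg(p_values, fdr=0.25):
    """Return booleans marking which tests pass Benjamini-Hochberg at the given FDR."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose sorted p-value satisfies p_(k) <= (k / m) * fdr.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_k = rank
    # Every test ranked at or below max_k is declared significant.
    passed = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            passed[idx] = True
    return passed


# Hypothetical data: 4 modest p-values among 96 null-like ones (100 tests in total).
p_values = [0.01, 0.02, 0.03, 0.04] + [0.05 + 0.01 * i for i in range(96)]
nominal = [p < 0.05 for p in p_values]
bh = benjamini_hochberg(p_values, fdr=0.25)
print(sum(nominal), sum(bh))
```

      In this toy example, four tests pass the nominal p < 0.05 cutoff but none survive even a permissive 25% FDR, which mirrors the pattern the reviewer describes for the trans calls.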

      Next, the authors seek to show that differences in DNA methylation are functionally relevant. Consistent with previous results, they show that differences in DNA methylation are (weakly) associated with changes in gene expression. They hypothesize that genes with concordant regulatory elements should exhibit greater methylation-expression coupling than other genes and show that cis-expression/cis-methylation pairs are more strongly correlated than trans/trans pairs. However, I worry that this result could be confounded by larger effect sizes for cis-changes than trans effects. I also think that looking at cis/trans or trans/cis changes would have been useful to directly test the driving hypothesis. Another limitation is that this analysis is limited to promoter regions. It is not clear how many divergent DMRs are included and how many of those genes have differences in expression. The key question is whether differences in DNA methylation are functionally important, and the answer provided by these analyses is "sometimes".

      Finally, the authors make a case for lineage-specific selection on DNA methylation that is connected to human traits. This evidence was not convincing. In fact, it is even said that these tests cannot be interpreted as evidence of lineage-specific selection (lines 399-401), so I am confused why these results are framed as testing for selection. The evidence better supports an argument connecting DNA methylation to human phenotypes.

      In conclusion, I think this study provides a valuable resource for differences in DNA methylation between humans and chimpanzees across tissues, and provides important insight into the relative abundance of cis and trans regulatory evolution. Additional research is necessary to investigate the underlying regulatory mechanisms, and more care needs to be taken in exploring the functional consequences.

    3. Reviewer #2 (Public review):

      Summary:

      This manuscript investigates the causes and consequences of human-specific DNA methylation divergence relative to chimpanzees. The main aim of this study is to disentangle cis- and trans-regulatory contributions to DNA methylation differences, which the authors address using an innovative interspecies hybrid cell system differentiated into multiple cell types. This design allows them to control for trans-acting environments and directly compare allelic regulation.
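
      The logic of the hybrid design can be sketched as follows. This uses one common convention from hybrid studies, in which the allelic difference inside the hybrid estimates the cis effect and the remainder of the parental difference is attributed to trans. The manuscript's exact statistical definitions may differ, and all names and methylation values here are hypothetical.

```python
def decompose(parent_human, parent_chimp, hybrid_human, hybrid_chimp):
    """Split a between-species methylation difference into cis and trans parts.

    Inputs are methylation fractions (0 to 1) at one site or region:
    the two parental diploid lines and the two alleles within the hybrid.
    """
    total = parent_human - parent_chimp   # divergence between the parental cell lines
    cis = hybrid_human - hybrid_chimp     # persists when both alleles share one nucleus
    trans = total - cis                   # disappears in the shared environment
    return cis, trans


# Hypothetical region A: the allelic difference is retained in the hybrid (mostly cis).
cis_a, trans_a = decompose(0.80, 0.30, 0.78, 0.32)
# Hypothetical region B: the difference collapses in the hybrid (mostly trans).
cis_b, trans_b = decompose(0.80, 0.30, 0.55, 0.53)
print(round(cis_a, 2), round(trans_a, 2))
print(round(cis_b, 2), round(trans_b, 2))
```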

      The authors show that cis-regulatory mechanisms dominate DNA methylation divergence and that methylation-expression coupling is strongest when both are cis-regulated. They further explore potential mechanisms underlying these patterns, including CpG-disrupting mutations and transcription factor-associated trans effects, and identify pathways that may reflect lineage-specific regulatory evolution.

      This study provides a valuable dataset and a compelling framework for understanding how local sequence variation contributes to epigenetic and transcriptional divergence, with likely broad impact in comparative and evolutionary genomics.

      Strengths:

      A major strength of this study is the use of human-chimpanzee hybrid cells, which provides a powerful system to disentangle cis- and trans-regulatory effects in a shared cellular environment. This experimental design allows for a more definitive assessment of regulatory mechanisms than traditional cross-species comparisons.

      The study also benefits from the inclusion of multiple differentiated cell types, increasing the robustness and generality of the conclusions. The consistent observation that cis-regulatory mechanisms dominate methylation divergence across these contexts is well supported by both CpG-level and DMR-level analyses.

      Another important contribution is the finding that methylation-expression coupling is strongest when both are cis-regulated. This provides a mechanistic explanation for previously observed weak global correlations between methylation and gene expression. Given that the nature of regulatory evolution is likely highly heterogeneous, this study adds valuable insights and guidelines for future investigations. I recommend that the authors provide a list of cis-cis-regulated promoters and their associated genes, which would be a valuable resource for the field.

      The dataset itself, namely, comprehensive DNA methylation and gene expression across multiple cell types in shared cellular contexts, as well as a primary cell type, is a valuable resource for the field. Additionally, the application of the two-step sign test identifies biologically relevant pathways, suggesting links between epigenetic divergence and human-specific traits.

      Weaknesses:

      Although the authors identify transcription factors associated with differential methylation, it is unclear what proportion of differentially methylated CpGs or DMRs can be attributed to these factors. Providing a quantitative estimate would help assess the relative contribution of trans-acting regulation.

      The analysis of CpG-disrupting mutations is interesting but raises two concerns. First, other classes of variants, such as transcription factor binding site-disrupting mutations, could also influence local methylation patterns and are not considered here. Second, the causal direction remains ambiguous: CpG-disrupting mutations may result from methylation-associated mutational processes (e.g., C→T transitions at methylated CpGs) rather than being the primary drivers of methylation divergence. While additional analyses may not be necessary, explicitly acknowledging these alternative explanations would strengthen the interpretation.

      Regarding the discussion comparing the distance between CpG-disrupting SNVs and trans-DMRs, without information on the absolute or relative distance distributions, it was difficult to assess the magnitude of the observed differences. Moreover, trans-DMRs, by definition, are not driven by local (cis) variation, and the lack of proximity to CpG-disrupting SNVs is expected. Clarifying what additional insight this analysis provides beyond this expectation may improve this section.

      One potential extension would be to examine whether the same cis-acting SNVs are consistently associated with methylation differences across multiple cell types. If these variants are mechanistically causal, one might expect their effects to be reproducible across contexts, or at least more frequent than expected by chance. Such an analysis could further support the proposed link between sequence variation and methylation divergence.

      Regarding their two-step sign test analysis, because enrichment-based approaches can sometimes overemphasize statistical significance without reflecting effect size, I wonder if incorporating the magnitude of methylation change would provide additional information or strengthen these findings. While the authors highlight some cases, such as TUBB2 and GRIK, a more general overview and/or integration of effect size into the analysis or discussion would improve interpretability.

    4. Reviewer #3 (Public review):

      Summary:

      Ma et al. use human-chimpanzee tetraploid cells to examine species differences in DNA methylation. They identify differentially methylated regions under cis or trans regulation. Cis-DMRs are enriched near SNVs that disrupt or create CpGs, providing a plausible mechanism for cis changes in methylation. They also seek to identify transcription factors that might affect methylation in trans, as well as gene sets with evidence for consistent changes in methylation and expression between humans and chimpanzees.

      Strengths:

      The authors have generated a new dataset across multiple cell types, examining differences in DNA methylation between humans and chimpanzees using human diploid cells, chimpanzee diploid cells, and human-chimpanzee tetraploid cells. Using this dataset, they identify that cis-DMRs are enriched near SNVs that disrupt or create CpGs compared to trans-DMRs, and identify transcription factors as candidate trans-acting factors. Both identified SNVs and transcription factors are good candidates for future experimentation. Further, they find that cis-DMRs are more highly correlated with cis-expressed genes than trans-DMRs with trans-expressed genes, providing evidence that methylation and expression are linked genome-wide.

      Weaknesses:

      The authors could greatly improve the manuscript by focusing on two issues.

      (1) Strengthening their cis/trans analysis, including:
      a) only showing or analyzing genomic regions that pass FDR correction;
      b) clarifying how cis genes are defined (Figure 2B shows some genes labeled as cis where the direction-of-effect differs between hybrid and parent cells);
      c) assessing how well powered they are to perform each analysis.

      (2) Softening claims about human evolution or human specificity for several reasons:
      a) Their comparison lacks tetraploid controls (e.g. human-human tetraploids and chimp-chimp tetraploids) or experimental follow-up in diploid cells, making it hard to be certain that observed effects are not due to ploidy.
      b) There are no outgroup species included in the analysis.
      c) The use of no or very loose FDR corrections with the sign test makes it difficult to draw conclusions.
      d) Experimental data to link SNVs to changes in cis methylation or identified transcription factors to changes in trans methylation would be needed to validate the authors' predictions.

    1. eLife Assessment

      This study investigates the role of developmental oligodendrocytes in synchronising spontaneous activity in neuronal circuits and influencing cerebellar-dependent behaviour. The authors use advanced viral targeting techniques to deplete oligodendrocytes in a cell-specific manner, paired with in vivo calcium imaging of Purkinje cells, to establish a relationship between oligodendrocyte-mediated neuronal synchrony and complex brain function. The authors present compelling evidence of oligodendrocyte-regulated neuronal synchrony. Overall, this manuscript holds promise as an important contribution to neurodevelopment research.

    2. Reviewer #1 (Public review):

      [Editor's note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have satisfactorily addressed the previous concerns raised by the reviewers.]

      Summary:

      This study presents convincing findings that oligodendrocytes play a regulatory role in spontaneous neural activity synchronization during early postnatal development, with implications for adult brain function. Utilizing targeted genetic approaches, the authors demonstrate how oligodendrocyte depletion impacts Purkinje cell activity and behaviors dependent on cerebellar function. Delayed myelination during critical developmental windows is linked to persistent alterations in neural circuit function, underscoring the lasting impact of oligodendrocyte activity.

      Strengths:

      (1) The research leverages the anatomically distinct olivocerebellar circuit, a well-characterized system with known developmental timelines and inputs, strengthening the link between oligodendrocyte function and neural synchronization.

      (2) Functional assessments, supported by behavioral tests, validate the findings of in vivo calcium imaging, enhancing the study's credibility.

      (3) Extending the study to assess long-term effects of early life myelination disruptions adds depth to the implications for both circuit function and behavior.

      Weaknesses:

      (1) The study would benefit from a closer analysis of myelination during the periods when synchrony is recorded. Direct correlations between myelination and synchronized activity would substantiate the mechanistic link and clarify if observed behavioral deficits stem from altered myelination timing.

      (2) Although the study focuses on Purkinje cells in the cerebellum, neural synchrony typically involves cross-regional interactions. Expanding the discussion on how localized Purkinje synchrony affects broader behaviors, such as anxiety, motor function, and sociality, would enhance the findings' functional significance.

      (3) The authors discuss the possibility of oligodendrocyte-mediated synapse elimination as a possible mechanism behind their findings, drawing from relevant recent literature on oligodendrocyte precursor cells. However, there are no data presented supporting these assumptions. The authors should explain why they think the mechanism behind their observation extends beyond the contribution of myelination or remove this point from the discussion entirely.

      Comment for resubmission: Although the argument on synaptic elimination has been removed, it has been replaced with similarly unclear speculation about roles for oligodendrocytes outside of conventional myelination or metabolic support, again without clear evidence. The authors measured MBP area but have not performed detailed analysis of oligodendrocyte biology to support the claims made in the discussion. Please consider removing this section or rephrasing it to align with the data presented.

      (4) It would be valuable to investigate secondary effects of oligodendrocyte depletion on other glial cells, particularly astrocytes or microglia, which could influence long-term behavioral outcomes. Identifying whether the lasting effects stem from developmental oligodendrocyte function alone or also involve myelination could deepen the study's insights.

      (5) The authors should explore the use of different methods to disturb myelin production for a longer time, in order to further determine if the observed effects are transient or if they could have longer-lasting effects.

      (6) Throughout the paper, there are concerns about statistical analyses, particularly on the use of the Mann-Whitney test or using fields of view as biological replicates.

    3. Author response:

      The following is the authors’ response to the previous reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This study presents convincing findings that oligodendrocytes play a regulatory role in spontaneous neural activity synchronisation during early postnatal development, with implications for adult brain function. Utilising targeted genetic approaches, the authors demonstrate how oligodendrocyte depletion impacts Purkinje cell activity and behaviours dependent on cerebellar function. Delayed myelination during critical developmental windows is linked to persistent alterations in neural circuit function, underscoring the lasting impact of oligodendrocyte activity.

      Strengths:

      (1) The research leverages the anatomically distinct olivocerebellar circuit, a well-characterized system with known developmental timelines and inputs, strengthening the link between oligodendrocyte function and neural synchronization.

      (2) Functional assessments, supported by behavioral tests, validate the findings of in vivo calcium imaging, enhancing the study's credibility.

      (3) Extending the study to assess the long-term effects of early-life myelination disruptions adds depth to the implications for both circuit function and behavior.

      We appreciate this positive evaluation.

      Weaknesses:

      (1) The study would benefit from a closer analysis of myelination during the periods when synchrony is recorded. Direct correlations between myelination and synchronized activity would substantiate the mechanistic link and clarify if observed behavioral deficits stem from altered myelination timing.

      We appreciate the reviewer’s thoughtful suggestion and have expanded the manuscript to clarify how oligodendrocyte maturation relates to the development of Purkinje-cell synchrony. The developmental trajectory of Purkinje-cell synchrony has already been comprehensively characterized by Good et al. (2017, Cell Reports 21: 2066–2073): synchrony drops from a high level at P3–P5 to adult-like values by P8. We found that myelination in the cerebellum first appears at P5–P7 (Figure S1A, B), indicating that the timing of Purkinje cell desynchronization coincides with the initial appearance of oligodendrocytes and myelin in the cerebellum. To determine whether myelin growth could nevertheless modulate this process, we quantified ASPA-positive oligodendrocyte density and MBP-positive bundle thickness and area at P10, P14, P21 and adulthood (Fig. 1J, K, Fig. S1E). Both metrics increase monotonically and clearly lag behind the rapid drop in synchrony, indicating that myelination may not be the primary trigger for desynchronization. When oligodendrocytes were ablated during the second postnatal week, synchrony was reduced (new Fig. 2). Thus, once myelination is underway, oligodendrocytes become critical for maintaining synchrony, acting not as the initiators but as the stabilizers and refiners of the mature network state.

      We have now added a new subsection to the Discussion (lines 451–467) in which we propose a two-phase model. Phase I (P3–P8): High early synchrony is generated by non-myelin mechanisms (e.g. transient gap junctions, shared climbing-fiber input). Phase II (P8 onward): As oligodendrocytes proliferate and ensheath axons, they fine-tune conduction velocity and stabilize the mature, low-synchrony network state.

      We believe these additions fully address the reviewer’s concerns.

      (2) Although the study focuses on Purkinje cells in the cerebellum, neural synchrony typically involves cross-regional interactions. Expanding the discussion on how localized Purkinje synchrony affects broader behaviors, such as anxiety, motor function, and sociality, would enhance the findings' functional significance.

      We appreciate the reviewer’s helpful suggestion and have expanded the Discussion (lines 543–564) to clarify how localized Purkinje-cell synchrony can influence broader behavioral domains. In the revised text we note that changes in PC synchrony propagate into thalamic, prefrontal, limbic, and parietal targets, thereby impacting distributed networks involved in motor coordination, affect, and social interaction. Our optogenetic rescue experiments further support this framework, as transient resynchronization of PCs normalized sociability and motor coordination while leaving anxiety-like behavior impaired. This dissociation highlights that different behavioral domains rely to varying degrees on precise cerebellar synchrony and underscores how even localized perturbations in Purkinje timing can acquire system-level significance.

      (3) The authors discuss the possibility of oligodendrocyte-mediated synapse elimination as a possible mechanism behind their findings, drawing from relevant recent literature on oligodendrocyte precursor cells. However, there are no data presented supporting this assumption. The authors should explain why they think the mechanism behind their observation extends beyond the contribution of myelination or remove this point from the discussion entirely.

      We thank the reviewer for pointing out that our original discussion of oligodendrocyte-mediated synapse elimination was not directly supported by data in the present manuscript. Because we are actively analyzing this question in a separate, follow-up study, we have deleted the speculative passage to keep the current paper focused on the demonstrated, myelination-dependent effects. We believe this change sharpens the mechanistic narrative and fully addresses the reviewer’s concern.

      (4) It would be valuable to investigate the secondary effects of oligodendrocyte depletion on other glial cells, particularly astrocytes or microglia, which could influence long-term behavioral outcomes. Identifying whether the lasting effects stem from developmental oligodendrocyte function alone or also involve myelination could deepen the study's insights.

      We thank the reviewer for raising this point and have performed the requested analyses. Using IBA1 immunostaining for microglia and S100b for Bergmann glia, we quantified cell density and marker signal intensity at P14 and P21. Neither microglial nor Bergmann-glial measures differed between control and oligodendrocyte-ablated mice at either time point (new Figure S2). These results indicate that the behavioral phenotypes we report are unlikely to arise from secondary activation or loss of other glial populations.

      We have now added these results (lines 275–286) and also discuss myelination and other oligodendrocyte functions (lines 443–450). It remains difficult to disentangle conduction-related effects from myelination-independent trophic roles of oligodendrocytes. We therefore note explicitly that future work employing stage-specific genetic tools or acute metabolic manipulations will be required to parse these contributions more definitively.

      (5) The authors should explore the use of different methods to disturb myelin production for a longer time, in order to further determine if the observed effects are transient or if they could have longer-lasting effects.

      We agree that distinguishing transient from enduring effects is critical. Importantly, our original submission already included data demonstrating a persistent deficit of PC population synchrony (Fig. 4, previous Fig. 3): (i) at P14—the early age after oligodendrocyte ablation—population synchrony is reduced, and (ii) the same deficit is still present in adults (P60–P70) despite full recovery of ASPA-positive cell density and MBP-area and -thickness (Fig. 2H-K, Fig. S1E, and Fig. 4). We also performed the ablation of oligodendrocytes after the third postnatal week. Despite a similar acute drop in ASPA-positive cells, neither population synchrony nor anxiety-, motor-, or social behaviors differed from littermate controls. Thus, extending myelin disruption beyond the developmental window does not exacerbate or prolong the phenotype, whereas a short perturbation within that window leaves a permanent timing defect. These findings strengthen our conclusion that it is the developmental oligodendrocyte/myelination program itself—rather than ongoing adult myelin production—that is essential for establishing stable network synchrony. We now highlight this point explicitly in the revised Discussion (lines 507–522).

      (6) Throughout the paper, there are concerns about statistical analyses, particularly on the use of the Mann-Whitney test or using fields of view as biological replicates.

      We appreciate the reviewer’s guidance on appropriate statistical treatment. To address these concerns we have re-analyzed all datasets that contained multiple measurements per animal (e.g., fields of view, lobules, or trials) using nested statistics with animal as the higher-order unit. Specifically, we applied a two-level nested ANOVA when more than two groups were compared and a nested t-test when two conditions were present. The re-analysis confirmed all original conclusions. Because the nested models yielded comparable effect sizes to the Mann–Whitney tests, we have retained the mean ± SEM for ease of comparison with prior literature but now also report all values for each mouse in Table 1. In cases where a single measurement per mouse was compared between two groups, we used the Mann–Whitney test and present the results in the graphs as median values.
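
      The core idea behind treating the animal, rather than the field of view, as the unit of replication can be sketched as follows. This is a simplification, not the nested ANOVA or nested t-test described in the response: it collapses repeated measurements to one mean per animal and then computes Welch's t statistic on those means. All animal IDs and synchrony values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def per_animal_means(measurements):
    """Collapse repeated measurements to one value per animal.

    measurements: dict mapping animal ID -> list of per-field values.
    """
    return [mean(values) for values in measurements.values()]


def welch_t(group_a, group_b):
    """Welch's t statistic and degrees of freedom for two lists of animal means."""
    na, nb = len(group_a), len(group_b)
    va = variance(group_a) / na  # squared standard error of group A
    vb = variance(group_b) / nb  # squared standard error of group B
    t = (mean(group_a) - mean(group_b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical synchrony values: several fields of view per mouse, 3 mice per group.
control = {"c1": [0.30, 0.32, 0.31], "c2": [0.28, 0.29], "c3": [0.33, 0.35, 0.34]}
ablated = {"a1": [0.20, 0.22], "a2": [0.18, 0.19, 0.17], "a3": [0.24, 0.23]}

control_means = per_animal_means(control)
ablated_means = per_animal_means(ablated)
t_stat, dof = welch_t(control_means, ablated_means)
print(f"animals per group: {len(control_means)}, t = {t_stat:.2f}, df = {dof:.1f}")
```

      The sample size entering the test is the number of animals (3 per group here), not the 8 and 7 individual fields of view, which is the distinction the reviewer's concern about pseudo-replication turns on.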

    1. eLife Assessment

      In this important work, it is demonstrated that certain high-resolution cryo-EM structures can be obtained by using concentrated cell extracts without purification. The compelling results with the mammalian ribosomes demonstrate the utility of this approach for this molecule and complexes with elongation factor 2. Moreover, this work also demonstrates the utility of 2D template matching for particle picking for structure determination by single-particle averaging pipelines.

    2. Reviewer #1 (Public review):

      Summary:

      The manuscript by Seraj et al. introduces a transformative structural biology methodology termed "in extracto cryo-EM." This approach circumvents the traditional, often destructive, purification processes by performing single-particle cryo-EM directly on crude cellular lysates. By utilizing high-resolution 2D template matching (2DTM), the authors localize ribosomal particles within a complex molecular "crowd," achieving near-atomic resolution (~2.2 Å). The biological centerpiece of the study is the characterization of the mammalian translational apparatus under varying physiological states. The authors identify elongation factor 2 (eEF2) as a nearly universal hibernation factor, remarkably present not only on non-translating 80S ribosomes but also on 60S subunits. The study provides a detailed structural atlas of how eEF2, alongside factors like SERBP1, LARP1, and IFRD2, protects the ribosome's most sensitive functional centers (the PTC, DC, and SRL) during cellular stress.

      Strengths:

      The "in extracto" approach is a significant leap forward. It offers the high resolution typically reserved for purified samples while maintaining the "molecular context" found in in situ studies. This addresses a major bottleneck in structural biology: the loss of transiently bound or labile factors during biochemical purification.

      The finding that eEF2 binds and sequesters 60S subunits is a major biological insight. This suggests a "pre-assembly" hibernation state that allows for rapid mobilization of the translation machinery once stress is relieved, which was previously uncharacterized in mammalian cells.

      The authors successfully captured eIF5A and various hibernation factors in states that are traditionally disrupted. The identification of eIF5A across nearly all translating and non-translating states highlights the power of this method to detect ubiquitous but weakly bound regulators.

      The manuscript beautifully illustrates the "shielding" mechanism of the ribosome. By mapping the binding sites of eEF2 and its co-factors, the authors provide a clear chemical basis for how the cell prevents nucleolytic cleavage of ribosomal RNA during nutrient deprivation.

      Weaknesses:

      While 2DTM is a powerful search tool, it inherently relies on a known structural "template." There is a risk that this methodology may be "blind" to highly divergent or novel macromolecular complexes that do not share sufficient structural similarity with the search model. The authors should discuss the limitations of using a vacant 60S/80S template in identifying highly remodeled stress-induced complexes. For instance, what happens if an empty 40S subunit is used as a template? In the current work, while 60S and 80S particles are picked, none are 40S. The authors should comment on this.

      In the GTPase center, the authors identify density for "DRG-like" proteins. However, due to limited local resolution in that specific region, they are unable to definitively distinguish between DRG1 and DRG2. While the structural similarity is high, the functional implications differ, and the identification remains somewhat speculative. The authors should acknowledge this in the text.

      While "in extracto" is superior to purified SPA, the act of cell lysis (even rapid permeabilization) still involves a change in the chemical environment (pH, ion concentration, and dilution of metabolites). The authors could strengthen the manuscript by discussing how post-lysis changes might affect the occupancy of factors like GTP vs. GDP states.

      The study provides excellent snapshots of stationary states (translating vs. hibernating), but the kinetic transition, specifically how the 60S-eEF2 complex is recruited back into active translation, is not well discussed. On page 13, the authors present eEF2 bound to 60S but do not mention anything regarding which nucleotide is bound to the factor. It only becomes clear that it is GDP after looking at Figure S9. This should be clarified in the text. Similarly, the observations that eEF2 is bound to GDP in the 60S and 80S raise questions as to how the factor dissociates from the ribosome. This could also be discussed.

      Overall Assessment:

      The work reported in this manuscript likely represents the future of structural proteomics. The combination of high-resolution structural biology with minimal sample perturbation provides a new standard for investigating the cellular machines that govern life. After addressing minor points regarding template bias, protein identification, and transition dynamics, this work may become a landmark in the field of translation.

      Comments on revisions:

      In the revised version of the manuscript, the authors have addressed my prior concerns.

    3. Reviewer #2 (Public review):

      In this manuscript, the authors describe using "in extracto" cryo-EM to obtain high-resolution structures of mammalian ribosomes from concentrated cell extracts without further purification or reconstitution. This approach aims to solve two related problems. The first is that purified ribosomes often lose cellular cofactors, which are often reconstituted in vitro; this precludes the ability to find novel interactions. The second is that while it is possible to perform cryo-EM on cellular lamella, FIB milling is a slow and laborious process, making it unfeasible to collect datasets sufficiently large to allow for high-resolution structure determination. Extracts should contain all cellular cofactors and allow for grid preparation similar to standard single-particle analysis (SPA) approaches. While cryo-EM of cell extracts is not in itself novel, this manuscript uses 2D template matching (2DTM) for particle picking prior to structure determination using more standard SPA pipelines. This should allow for improved picking over other approaches in order to obtain large datasets for high-resolution SPA.

      This manuscript has two main results: novel structures of ribosomes in hibernating states; and a proof-of-principle for in extracto cryo-EM using 2DTM. Overall, I think the results presented here are strong and serve as a proof-of-principle for an approach that may be useful to many others.

      Comments on revisions:

      This current draft addresses my prior comments regarding usability for readers through the addition of text describing how parameters were optimized as well as an additional supplementary figure outlining the processing workflow. With these additions, I have no further comments.

    4. Reviewer #3 (Public review):

      Summary:

      The authors describe a new structural biology framework termed "in extracto cryo-EM," which aims to bridge the gap between single-particle cryo-EM of purified complexes and in situ cryo-electron tomography (cryo-ET). By utilizing high-resolution 2D template matching (2DTM) on mammalian cell lysates, the authors sought to visualize the translational apparatus in a near-native environment while maintaining near-atomic resolution. The study identifies elongation factor 2 (eEF2) as a major hibernation factor bound to both 60S and 80S particles and describes a variety of hibernation scenarios involving factors such as SERBP1, LARP1, and CCDC124.

      Strengths:

      (1) The use of 2DTM effectively overcomes the signal-to-noise challenges posed by the dense and viscous nature of cellular extracts, yielding maps as high as 2.2 Å.

      (2) The discovery of eEF2-GDP as a ubiquitous shield for ribosomal functional centers, particularly its unexpected stabilization on the 60S subunit, provides a compelling model for ribosome preservation during stress.

      Weaknesses:

      (1) Representative nature of cell samples and lower detection limit

      The cells used in this study (MCF-7, BSC-1, and RRL) are either fast-growing cancer cell lines or specialized protein-synthetic systems. For cells with naturally low ribosomal abundance (such as quiescent primary cells), achieving the target concentration (e.g., A260 > 1000 ng/uL) would require an exponentially larger starting cell population.

      Is there a defined lower limit of ribosomal concentration in the raw lysate below which the 2DTM algorithm fails to yield high-resolution classes? In ribosome-sparse lysates, A260 becomes an unreliable proxy for ribosome density due to the high background of other RNA species and proteins. How do the authors estimate specific ribosome abundance in such heterogeneous fields?

      (2) Quantitation in heterogeneous lysates and crowding effects

      The authors utilize A260 as a key quality control measure before grid preparation. However, if extreme physical concentration is required to see enough particles, the background concentration of other cytoplasmic components also increases. This may lead to molecular crowding or sample viscosity that interferes with the formation of optimal thin ice. How do the authors calculate or estimate the specific abundance of ribosomes in the cryo-EM field of view when they represent a much smaller percentage of the total cellular content?

      (3) Optimization of sample preparation

      The authors describe lysates as dense and viscous, requiring multiple blotting steps (2-3 times) for 3-8 seconds. Have the authors tested whether a larger molecular weight cutoff (e.g., 100 kDa) during concentration could improve the ribosome-to-background ratio without losing small factors like eIF5A (approx. 17 kDa)? Could repeated blotting of a concentrated, viscous lysate introduce shearing forces or increased exposure to the air-water interface that perturbs the native conformation of the complexes?

      (4) The regulatory switch and mechanism of eEF2

      The finding that eEF2-GDP occupies dormant ribosomes is striking. What drives eEF2 from its canonical role in translocation to this hibernation state? Is this transition purely driven by stoichiometry (lack of mRNA/tRNA) and the GDP/GTP ratio, or is there a role for post-translational modifications? How do these eEF2-bound dormant ribosomes rapidly re-enter the translation pool upon stress relief?

      (5) Hibernation diversity and LARP1 contextualization

      The study reveals that hibernation strategies vary across cell types. Does the high hibernation rate in RRL reflect a physiological state, or does it hint at "preparation-induced stress" due to resource exhaustion or mRNA degradation in the cell-free system? How do the authors reconcile their discovery of LARP1 on 80S particles with recent 2024 reports that primarily describe LARP1 as an SSU-bound repressor?

      Comments on revisions:

      The authors have addressed the issues I had raised in my initial review. The additional data and clarifications provided in the revision are satisfactory. I have no further recommendations.

      Thanks to the authors for their efforts.

    5. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      In this important work, it is demonstrated that certain high-resolution cryo-EM structures can be obtained by using concentrated cell extracts without purification. The compelling results with the mammalian ribosomes demonstrate the utility of this approach for this molecule and complexes with elongation factor 2. Moreover, this work also demonstrates the utility of 2D template matching for particle picking for structure determination by single-particle averaging pipelines.

      We thank the reviewers for their valuable comments and suggestions, which have helped us to improve the manuscript. We provide a response to the referees’ comments below.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The manuscript by Seraj et al. introduces a transformative structural biology methodology termed "in extracto cryo-EM." This approach circumvents the traditional, often destructive, purification processes by performing single-particle cryo-EM directly on crude cellular lysates. By utilizing high-resolution 2D template matching (2DTM), the authors localize ribosomal particles within a complex molecular "crowd," achieving near-atomic resolution (~2.2 Å). The biological centerpiece of the study is the characterization of the mammalian translational apparatus under varying physiological states. The authors identify elongation factor 2 (eEF2) as a nearly universal hibernation factor, remarkably present not only on non-translating 80S ribosomes but also on 60S subunits. The study provides a detailed structural atlas of how eEF2, alongside factors like SERBP1, LARP1, and IFRD2, protects the ribosome's most sensitive functional centers (the PTC, DC, and SRL) during cellular stress.

      Strengths:

      The "in extracto" approach is a significant leap forward. It offers the high resolution typically reserved for purified samples while maintaining the "molecular context" found in in situ studies. This addresses a major bottleneck in structural biology: the loss of transiently bound or labile factors during biochemical purification.

      The finding that eEF2 binds and sequesters 60S subunits is a major biological insight. This suggests a "pre-assembly" hibernation state that allows for rapid mobilization of the translation machinery once stress is relieved, which was previously uncharacterized in mammalian cells.

      The authors successfully captured eIF5A and various hibernation factors in states that are traditionally disrupted. The identification of eIF5A across nearly all translating and non-translating states highlights the power of this method to detect ubiquitous but weakly bound regulators.

      The manuscript beautifully illustrates the "shielding" mechanism of the ribosome. By mapping the binding sites of eEF2 and its co-factors, the authors provide a clear chemical basis for how the cell prevents nucleolytic cleavage of ribosomal RNA during nutrient deprivation.

      Weaknesses:

      (1) While 2DTM is a powerful search tool, it inherently relies on a known structural "template." There is a risk that this methodology may be "blind" to highly divergent or novel macromolecular complexes that do not share sufficient structural similarity with the search model. The authors should discuss the limitations of using a vacant 60S/80S template in identifying highly remodeled stress-induced complexes. For instance, what happens if an empty 40S subunit is used as a template? In the current work, while 60S and 80S particles are picked, none are 40S. The authors should comment on this.

      Thank you for your comment. As noted by the reviewer, 2DTM inherently favors particles that share sufficient similarity with the search template and may underrepresent highly remodeled or structurally divergent complexes. Importantly, once particles are identified, subsequent 2D/3D classification and refinement are not constrained by the template used for particle picking. Consistent with this, we observe classes displaying additional or altered densities absent in the original template, indicating that template matching does not preclude the detection of remodeled ribosomal states, although highly divergent species may still escape detection.

      Regarding the use of a 40S subunit as a template for 2DTM, we tested two templates: a complete 40S subunit and the 40S body alone. Using these 40S templates, we captured several 40S-, 43S-, and 48S-containing complexes, as well as 80S particles. As expected, no individual 60S classes emerge with 40S-TM. 40S-TM yielded 80S classes similar to those with 60S-TM, although the number of particles was lower than that in 60S template matching, resulting in lower resolution of these classes. Since this study focuses on ribosome hibernation, we chose to proceed with the 60S-TM results and do not report results using 40S-TM. We reported 40S-TM results in another study from our groups (Zottig et al., bioRxiv, 2025), which focuses on translation initiation on 40S subunits and was deposited as a preprint after this submission.

      We have added a comment and reference describing the use of the 40S template in the initial section of Results and Discussion: “This result echoes our concurrent finding that using 40S or partial 40S templates yields a variety of initiation complexes and 80S classes, revealing densities beyond those in the template [44].”

      (2) In the GTPase center, the authors identify density for "DRG-like" proteins. However, due to limited local resolution in that specific region, they are unable to definitively distinguish between DRG1 and DRG2. While the structural similarity is high, the functional implications differ, and the identification remains somewhat speculative. The authors should acknowledge this in the text.

      We agree with this comment and address it in the main text:

      “Whereas the overall shape and secondary structure resemble DRG1 or DRG2, the local resolution is insufficient to distinguish between these or other similarly structured proteins. Both yeast and mammalian counterparts are reported to function with a companion factor (Tma146p or Gir2 in yeast; or DFRP1 and DFRP2 in mammals), but our maps do not contain density that could correspond to DFRP1/2 near the putative DRG1/2 density. Future work will elucidate the function of these or other DRG-like GTPases in the context of an elongation complex.”

      (3) While "in extracto" is superior to purified SPA, the act of cell lysis (even rapid permeabilization) still involves a change in the chemical environment (pH, ion concentration, and dilution of metabolites). The authors could strengthen the manuscript by discussing how post-lysis changes might affect the occupancy of factors like GTP vs. GDP states.

      Thank you for pointing this out. Cell lysis can indeed lead to a change in the chemical environment, although we do not know how post-lysis changes may specifically affect the occupancy of factors, such as GTP- vs. GDP-bound states. We tried to minimize this effect by performing a rapid permeabilization. Our efforts to optimize our protocols are ongoing, and we expect to have a better answer to this question in the future.

      Nevertheless, to address this reviewer’s concern, our discussion states: “Additional optimization of buffer conditions may be required to more accurately represent the translation states observed in cells, as ionic conditions are known to affect the conformation of the ribosomes (e.g. rotated/non-rotated) and binding of protein factors”.

      (4) The study provides excellent snapshots of stationary states (translating vs. hibernating), but the kinetic transition, specifically how the 60S-eEF2 complex is recruited back into active translation, is not well discussed. On page 13, the authors present eEF2 bound to 60S but do not mention anything regarding which nucleotide is bound to the factor. It only becomes clear that it is GDP after looking at Figure S9. This should be clarified in the text. Similarly, the observations that eEF2 is bound to GDP in the 60S and 80S raise questions as to how the factor dissociates from the ribosome. This could also be discussed.

      Thank you for bringing this to our attention. We now state in the main text that eEF2 is bound with GDP on the 60S subunit.

      As for the kinetic transitions of 60S-eEF2 complexes, like this reviewer, we are fascinated by the possible roles and mechanisms of the 60S-eEF2 complex. The averaged particle ensembles derived from cryo-EM data do not report on the kinetics or transition pathways directly. We acknowledge in the main text that “Future studies will bring insights into the roles of the protein(s) and into the functions and transitions of 60S•eEF2 complexes to the pool of translating ribosomes”.

      Overall Assessment:

      The work reported in this manuscript likely represents the future of structural proteomics. The combination of high-resolution structural biology with minimal sample perturbation provides a new standard for investigating the cellular machines that govern life. After addressing minor points regarding template bias, protein identification, and transition dynamics, this work may become a landmark in the field of translation.

      Reviewer #2 (Public review):

      In this manuscript, the authors describe using "in extracto" cryo-EM to obtain high-resolution structures of mammalian ribosomes from concentrated cell extracts without further purification or reconstitution. This approach aims to solve two related problems. The first is that purified ribosomes often lose cellular cofactors, which are often reconstituted in vitro; this precludes the ability to find novel interactions. The second is that while it is possible to perform cryo-EM on cellular lamella, FIB milling is a slow and laborious process, making it unfeasible to collect datasets sufficiently large to allow for high-resolution structure determination. Extracts should contain all cellular cofactors and allow for grid preparation similar to standard single-particle analysis (SPA) approaches. While cryo-EM of cell extracts is not in itself novel, this manuscript uses 2D template matching (2DTM) for particle picking prior to structure determination using more standard SPA pipelines. This should allow for improved picking over other approaches in order to obtain large datasets for high-resolution SPA.

      This manuscript has two main results: novel structures of ribosomes in hibernating states; and a proof-of-principle for in extracto cryo-EM using 2DTM. Overall, I think the results presented here are strong and serve as a proof-of-principle for an approach that may be useful to many others. However, without presenting the logic of how parameters were optimized, this manuscript is limited in its direct utility to readers.

      Thank you for this valuable comment. We have expanded our Methods section “Optimization of 2DTM in RRL data” to present the logic behind parameter optimization, with the paragraph beginning with “We optimized high-resolution template matching procedures…”

      Reviewer #3 (Public review):

      Summary:

      The authors describe a new structural biology framework termed "in extracto cryo-EM," which aims to bridge the gap between single-particle cryo-EM of purified complexes and in situ cryo-electron tomography (cryo-ET). By utilizing high-resolution 2D template matching (2DTM) on mammalian cell lysates, the authors sought to visualize the translational apparatus in a near-native environment while maintaining near-atomic resolution. The study identifies elongation factor 2 (eEF2) as a major hibernation factor bound to both 60S and 80S particles and describes a variety of hibernation scenarios involving factors such as SERBP1, LARP1, and CCDC124.

      Strengths:

      (1) The use of 2DTM effectively overcomes the signal-to-noise challenges posed by the dense and viscous nature of cellular extracts, yielding maps as high as 2.2 Å.

      (2) The discovery of eEF2-GDP as a ubiquitous shield for ribosomal functional centers, particularly its unexpected stabilization on the 60S subunit, provides a compelling model for ribosome preservation during stress.

      Weaknesses:

      (1) Representative nature of cell samples and lower detection limit

      The cells used in this study (MCF-7, BSC-1, and RRL) are either fast-growing cancer cell lines or specialized protein-synthetic systems. For cells with naturally low ribosomal abundance (such as quiescent primary cells), achieving the target concentration (e.g., A260 > 1000 ng/uL) would require an exponentially larger starting cell population.

      Is there a defined lower limit of ribosomal concentration in the raw lysate below which the 2DTM algorithm fails to yield high-resolution classes? In ribosome-sparse lysates, A260 becomes an unreliable proxy for ribosome density due to the high background of other RNA species and proteins. How do the authors estimate specific ribosome abundance in such heterogeneous fields?

      We have not tested these specific points, but we found that 2DTM can successfully result in high-resolution reconstructions even with 1-2 particles per micrograph. This would require a substantially larger dataset than in this work yet could provide a viable strategy for diluted or low-abundance samples. Other optimizations, including lysate concentration, may help as well. We have the following text to reflect these points:

      “Additional optimization of buffer conditions may be required to more accurately represent the translation states observed in cells, as ionic conditions are known to affect the conformation of the ribosomes (e.g. rotated/non-rotated) and binding of protein factors [91-94]. For cells or samples with lower abundance of ribosomes or other macromolecules/complexes of interest, a lysate concentration step or collection of a larger dataset may be considered.”

      (2) Quantitation in heterogeneous lysates and crowding effects

      The authors utilize A260 as a key quality control measure before grid preparation. However, if extreme physical concentration is required to see enough particles, the background concentration of other cytoplasmic components also increases. This may lead to molecular crowding or sample viscosity that interferes with the formation of optimal thin ice. How do the authors calculate or estimate the specific abundance of ribosomes in the cryo-EM field of view when they represent a much smaller percentage of the total cellular content?

      We reported A260 as a reference that may be useful to achieve particle distributions resembling those in our work, rather than as a key quality control measure. Accordingly, we do not use it to estimate ribosome concentration or the specific abundance of ribosomes; instead, we’d recommend adjusting the sample concentration/dilution by grid screening.

      This reviewer mentions the important aspect of ice thickness. We found that the highest population of ribosome particles is found in thicker ice regions, and these particles have been used to make up the majority of our datasets leading to high-resolution reconstructions. We have added this observation to “Optimization of 2DTM in RRL data”.

      (3) Optimization of sample preparation

      The authors describe lysates as dense and viscous, requiring multiple blotting steps (2-3 times) for 3-8 seconds. Have the authors tested whether a larger molecular weight cutoff (e.g., 100 kDa) during concentration could improve the ribosome-to-background ratio without losing small factors like eIF5A (approx. 17 kDa)? Could repeated blotting of a concentrated, viscous lysate introduce shearing forces or increased exposure to the air-water interface that perturbs the native conformation of the complexes?

      We strived to minimize the number of steps in sample preparation, so we did not extensively test concentration steps. We also found that a concentration step can be omitted; the eIF5A-containing structure from the RRL dataset was determined without this step. We agree with the reviewer that repeated blotting may change ribosome complex equilibrium and result in a different distribution of functional states than in cells. However, we did not find evidence of perturbation of the native conformations of complexes, as the positions of ribosomes and factors are nearly identical to those observed in previous studies, including the recent high-resolution structures from cells that we cite.

      (4) The regulatory switch and mechanism of eEF2

      The finding that eEF2-GDP occupies dormant ribosomes is striking. What drives eEF2 from its canonical role in translocation to this hibernation state? Is this transition purely driven by stoichiometry (lack of mRNA/tRNA) and the GDP/GTP ratio, or is there a role for post-translational modifications? How do these eEF2-bound dormant ribosomes rapidly re-enter the translation pool upon stress relief?

      We are glad that this reviewer is fascinated by the eEF2-GDP occupancy on dormant ribosomes (just like we are)! These are important open questions that require further research, as our cryo-EM analyses cannot directly address the kinetic or mechanistic aspects of the mentioned processes. We did explore the known modification/phosphorylation sites in eEF2 densities but did not find evidence for such modifications, which does not rule out the possibility of transient or new modifications.

      (5) Hibernation diversity and LARP1 contextualization

      The study reveals that hibernation strategies vary across cell types. Does the high hibernation rate in RRL reflect a physiological state, or does it hint at “preparation-induced stress” due to resource exhaustion or mRNA degradation in the cell-free system? How do the authors reconcile their discovery of LARP1 on 80S particles with recent 2024 reports that primarily describe LARP1 as an SSU-bound repressor?

      Based on the high abundance of hibernating ribosomes in RRL (relative to many other samples we have tested so far), we speculate that this scenario may result from the stresses induced during lysate preparation: first, the rabbits are treated with phenylhydrazine inducing cell stress, then lysates are treated with micrococcal nuclease to degrade endogenous mRNAs. In addition, the specialization of reticulocytes may contribute to the distinct expression of stress/hibernation factors.

      As for LARP1, our finding is consistent with the 2024 work by Saba et al, who reported LARP1 binding to both 40S subunits and 80S ribosomes. They also noted that LARP1-bound ribosomes are “non-translating”, consistent with our structures.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) In Figure 3, it would be easier for the reader if the authors would report the % of particles in each class. Also, indicating body rotation and head swiveling values would help.

      Because our high-resolution maps result from a combination of data sets (e.g., RRL with an mRNA and RRL without an mRNA), we specify the particle percentages in the corresponding classification schemes in supplemental figures. To avoid excessive labeling in this figure, body rotation and head swiveling values for the new classes are shown in Figure 4.

      (2) Page 16, what is 'elongation factor 1'? It doesn't seem the authors refer to eEF1A?

      Thank you for pointing out this inconsistency; this is indeed eEF1A. We have corrected the text.

      (3) Page 16, after 'individual 60S subunits', there is a missing full stop.

      Thanks. Corrected.

      Reviewer #2 (Recommendations for the authors):

      I am not an expert in ribosome biology and do not have any specific comments on the various states presented here. Instead, I will mainly focus on the image processing aspects of this manuscript.

      Major points:

      (1) Were any AI-based particle pickers, such as crYOLO, topaz, or warp tested? While more traditional template-based or LoG pickers were shown to be inferior to 2DTM, it is unclear if AI methods would perform just as well. Given that a major point of this manuscript is the image processing pipeline, and that these AI tools have been widely adopted in the field, I think this is an important consideration.

      We used other particle pickers before using 2DTM and have listed them in the Supplementary Information: please see Table S1 for a complete list of particle pickers evaluated in this study. Since our present work focuses on a sample preparation method, a more extensive evaluation of particle picking methods is beyond the scope of this study.

      (2) While the methods used to obtain the structures presented are detailed, I think it would also be useful to provide some logic for how parameters were determined or optimized. This would serve as a useful foundation for readers who wish to try out this in an extracto approach on their own specimens. Some of these optimizations seem quite specific, such as optimization of angular search parameters, but with no clear logic: e.g., why is the out-plane search coarser than the in-plane search; what is the effect of increasing the angular step sizes? Some seem inconsistent, e.g., why is e2pdb2mrc.py sometimes used and the cisTEM simulate used other times? Some are poorly described, such as "the defocus search turned on for micrographs with thicker ice" where there is no mention of how ice thickness is assessed and how thick is too thick. I think a workflow figure with accompanying text would help the reader understand the logic used in this work and how to apply that logic to their own projects.

      To address the comments in (2), we provide separate responses addressing each comment:

      (1) Provide some logic for how parameters were determined or optimized:

      Search parameters are chosen by balancing search precision against computational cost. In practice, users must weigh the benefit of finer sampling against the substantial increase in runtime, particularly for large datasets. For example, enabling defocus searching with a 200 Å step size and a 1000 Å range increases the computational time approximately 11-fold compared with running the same search with defocus disabled (because every defocus plane in both the positive and negative directions is searched), which can be prohibitive when GPU resources are limited. In such cases, reducing the defocus search to a 250 Å step size and a 500 Å range can dramatically shorten runtime while preserving nearly the same number of reliable matches. In summary, we found that optimizing the defocus search, the in-plane and out-of-plane angular steps, and the image/micrograph pixel size can substantially reduce the processing time while sacrificing only a small percentage of particles.
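
      As a rough illustration of the arithmetic above, the sketch below counts the defocus planes visited for a given range and step. It assumes a symmetric search (the nominal plane plus an equal number of planes on either side) and that runtime scales linearly with the number of planes; the function name `defocus_planes` is hypothetical and is not part of cisTEM.

```python
def defocus_planes(search_range: int, step: int) -> int:
    """Count defocus planes for a symmetric search (an assumed model).

    The nominal plane is always searched; floor(range / step) extra
    planes are searched in each of the positive and negative directions.
    """
    if search_range <= 0 or step <= 0:
        return 1  # defocus search disabled: only the nominal plane
    per_side = search_range // step
    return 2 * per_side + 1


# Settings quoted in the response above (values in angstroms).
for search_range, step in [(1000, 200), (500, 250)]:
    planes = defocus_planes(search_range, step)
    print(f"range={search_range} step={step}: {planes} planes")
```

      Under these assumptions, the 1000 Å range with a 200 Å step gives 11 planes, in line with the roughly 11-fold increase in runtime quoted above, while the 500 Å range with a 250 Å step gives 5 planes.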

      We have expanded our parameter optimization paragraph in “Optimization of 2DTM in RRL data”, as mentioned in a previous response.

      (2) Some seem inconsistent, e.g., why is e2pdb2mrc.py sometimes used and the cisTEM simulate used other times?

      e2pdb2mrc.py is simpler to use and was used at the beginning of the project. Later, we switched to the simulate program because it performed slightly better. Either program is suitable for generating templates for 2DTM.

      (3) Some are poorly described, such as "the defocus search turned on for micrographs with thicker ice" where there is no mention of how ice thickness is assessed and how thick is too thick.

      We did not quantitatively assess ice thickness; instead, we tested whether it is advantageous to include the defocus search. To this end, we first performed CTF estimation and grouped micrographs based on their fit resolution. From each group, we selected ten micrographs representing the highest and lowest fit resolutions. Template matching was then performed using identical parameters, once with defocus search enabled and once with it disabled, and the number of picked particles for each micrograph was compared between the two conditions. When a significant difference was observed (most commonly for icy micrographs with low fit resolution), we enabled defocus search for that group of images. Enabling the defocus search sometimes yielded twice as many matches. These images/datasets appeared to have a higher background than in vitro reconstituted samples. The template-matching results from these micrographs were subsequently combined with results from groups processed with defocus search disabled.
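
      The per-group decision described above could be sketched as follows. This is only an illustration: the function name `enable_defocus_search`, the use of the median ratio, and the 1.5-fold threshold are assumptions made here, since the response does not specify how a "significant difference" was defined.

```python
from statistics import median


def enable_defocus_search(counts_off, counts_on, min_ratio=1.5):
    """Decide whether a micrograph group should use defocus search.

    counts_off / counts_on: picked-particle counts per test micrograph,
    in the same order, without and with defocus search.
    min_ratio is a hypothetical threshold, not a value from the study.
    """
    ratios = [on / off for off, on in zip(counts_off, counts_on) if off > 0]
    if not ratios:
        return False  # no usable micrographs: keep the cheaper search
    return median(ratios) >= min_ratio


# Made-up counts for two groups of test micrographs.
icy_group = enable_defocus_search([100, 120, 90], [210, 230, 200])
thin_group = enable_defocus_search([100, 120], [105, 118])
print(icy_group, thin_group)
```

      Groups for which the function returns True would be processed with defocus search enabled, and their matches would then be pooled with those from the remaining groups.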

      To address this point, we have included this description in “Optimization of 2DTM in RRL data”.

      (4) I think a workflow figure with accompanying text would help the reader understand the logic used in this work and how to apply that logic to their own projects.

      Thanks for this suggestion. We have added a workflow figure as Figure 1—figure supplement 2.

      Minor Points:

      (1) While the image processing described seems appropriate, I think it is still necessary to include Fourier shell correlation plots for the final structures as supplemental data.

      Thank you for pointing out this inadvertent omission. We have added FSC curves in Figure 3—figure supplement 3.

      (2) One of the initial workflows used is a Relion 3 pipeline, which is, at this point, quite dated. Is there a reason Relion 4 or 5 was not used instead?

      The project started when Relion 3 was the latest version.

    1. eLife Assessment

      This valuable study combines previously established mathematical models to investigate why cortical waves in starfish and Xenopus embryos propagate in opposite directions. The modeling results are solid and plausible, but remain experimentally untested. Improving the presentation and discussion of the results could make the study more accessible to a wider audience.

    2. Reviewer #1 (Public review):

      Summary:

      The main goal of this manuscript is to develop a mathematical model of the regulation of cortical dynamics by Cdk1 activity to explain why, in some embryos (e.g., Xenopus), surface contraction waves are believed to move in the same direction as Cdk1, while in other embryos (e.g., starfish) they are believed to move in the opposite direction.

      Strengths:

      (1) The paper addresses a very important question.

      (2) The mathematical model is sensible and suggests that the different relationship between Cdk1 and surface contraction waves might arise from the different behavior of the mitotic entry wave and the mitotic exit wave.

      (3) The authors propose a mechanism by which the wave observed at mitotic exit might not passively follow the trigger wave observed at mitotic entry.

      (4) The proposed mechanism is a potential explanation of the observed differences.

      (5) The proposed mechanism is centered on different dynamics between the nucleus and the cytoplasm, highlighting the potential importance of the nucleus (and nuclear size) in organizing cortical dynamics.

      Weaknesses:

      (1) The proposed mechanism works if the activity in the nucleus is much higher than the high activity (high state of the bistable system) of the cytoplasm. So, as the wave propagates across the cytoplasm, the activity around the nucleus remains higher, which potentially causes a delay in the onset of Cyclin B-Cdk1 degradation in the region around the nucleus compared to the surrounding cytoplasm. This effect happens over a typical length scale, and if such a length scale is comparable to embryo size, this becomes the predominant mechanism. However, such a mechanism should exist near the nucleus independently of embryo size. So, it seems that for embryos where the wave back and wave front should travel together, nuclear activity must be adjusted not to be much higher than cytoplasmic activity. A better discussion of the discovered process and its implications would strengthen the paper. It requires careful reading to understand what, in hindsight, is a rather simple explanation. Is there any experimental evidence that the overall activity of Cdk1 is higher in the nucleus than in the cytoplasm?

      (2) While the fact that Cdk1 can enslave cortical dynamics is clearly shown in the model, this is expected from the literature. There are systems where the enslavement of cortical and bulk actomyosin contractility to Cdk1 activity has been more clearly demonstrated (Drosophila and zebrafish embryos), as well as shown to have clear functions (nuclear positioning and ooplasmic segregation).

      (3) The writing could be improved. The authors make some claims of originality that seem a stretch, e.g., in the abstract, they say: "we develop a reaction-diffusion model of Cyclin B-Cdk1 signaling in spherical cells with localized nuclear activation", but they essentially use a previous model with a few numerical tweaks. The figures are sometimes mislabelled or not explained, and some of the units seem wrong.

      (4) The authors give the existence of trigger waves as a fact. While the predominant view is that such waves exist in the first cycle of the Xenopus embryos (however, this is from measurement of the cortical contractions, so a bit circular for this paper), it is unclear if waves exist in the starfish embryo, so the potential explanation that the starfish embryo simply has different Cdk1 dynamics cannot be ruled out.

    3. Reviewer #2 (Public review):

      Summary:

      Large oocytes show prominent waves of cortical contractions. Previous works combining experiments and computational modeling have shown that the waves are driven by gradients of CDK1 kinase activity that trigger excitable Rho activity patterns on the cortex. This present work combines two previously published mathematical models for CDK1 activation and Rho activation, respectively. They show that the models combined can explain diverse shapes of cortical contractions observed in different species and at various stages of development. This shows how the same molecular machinery can generate diverse patterns dependent on the size of the system and the size and position of the cell nucleus.

      Strengths:

      (1) Carefully done modeling work providing a simple and elegant explanation for a complex cellular behavior.

      (2) Very nicely illustrated, simulations can be directly compared to previous experimental observations.

      (3) Explains observations made in different model systems, providing a unifying model.

      Weaknesses:

      (1) Purely theoretical work, no experimental validation.

      (2) Adopts previously published models more or less 'as is', without detailed re-evaluation and re-assessment, or without developing them further.

      Overall, I find this work important, as it shows that combining models of the CDK1 gradient and Rho activation modules can explain the surface contraction waves observed in oocytes. Strikingly, it elegantly explains the differences seen between different experimental systems. While previously these were considered a 'controversy', modeling shows that the differences are simply a consequence of the difference in the size of the oocytes. In addition, the model makes several intriguing predictions that can be tested in future experiments.

    4. Reviewer #3 (Public review):

      Summary:

      Using realistic mathematical models, Cebrián-Lacasa et al. address the relationship between waves of activation of Cyclin B-Cdk1 that propagate through the cytoplasm of large (~1 mm) oocytes and fertilized eggs and surface contraction waves (SCWs) driven by Rho GTPase activity in the cell cortex. They present numerical simulations of the underlying reaction-diffusion equations that account in broad strokes for both the expected behavior of 'fronts' of Cdk1 activation (that propagate at constant velocity from the nucleus, the source of Cdk1 activity, to the cell cortex) and the unusual behavior of 'backs' of Cdk1 inactivation (that may propagate either away from or towards the nucleus, or exhibit simultaneous inactivation throughout the cytoplasm). They also model Rho GTPase activity in the cortex as an excitable system that propagates SCWs (target patterns, spiral waves, and more complicated patterns). When Cdk1 is activated in the cortex, it phosphorylates and inhibits the RhoGEF, Ect2, which suppresses SCWs by reducing Rho GTPase activity. As the wave-back of Cdk1 inactivation moves across the cortex, Rho GTPase activity recovers abruptly, and SCWs reappear as 'phase waves' whose speed and directionality are determined by the underlying cytoplasmic Cdk1 signal.

      Strengths:

      As a theoretical examination of an interesting and puzzling aspect of early embryonic development, this study shares the same strengths and weaknesses as all mathematical and computational approaches to molecular cell biology. The mathematical models are precise formulations of the underlying assumptions of the authors (which are quite reasonable in this reviewer's opinion), and the analysis and computational results are dependable consequences of the molecular mechanisms the authors have in mind. The model is expertly analyzed, and the results are both reliable and intriguing. The results are discussed in light of experimental evidence. Because the authors' methods and results suggest novel (and sometimes counterintuitive) avenues for experimental research, this paper is likely to have a significant impact on the field of Rho GTPase signaling in oocytes and early embryos, and perhaps in other cells as well.

      Weaknesses:

      Like all mathematical models, the underlying assumptions can be critiqued as neglecting this-or-that 'crucial' effect (e.g., mechanical coupling via cortical tension or cytoplasmic flow, as the authors acknowledge), and the highly technical methods of analysis and simulation can be unfamiliar and off-putting to experimental cell biologists. The paper is a difficult read, even for an experienced theoretician. For those who take the time to understand this paper, it may change the way they think about the coupling of cell cycle control (Cdk1 activation and inactivation) and cell surface contraction waves.

    1. eLife Assessment

      This study shows that combining forced cell cycle re-entry with Rbpj deletion enhances Müller glia dedifferentiation and promotes their conversion into retinal neuron-like cells in the uninjured mouse retina. It provides a valuable strategy for improving Müller glia-mediated neurogenesis and advancing regenerative potential in the mammalian retina. Overall, the data are convincing, but the conclusions would be strengthened by functional validation of the newly generated neurons and retinal performance, as well as an assessment of Müller glia long-term function and cell survival.

    2. Reviewer #1 (Public review):

      Summary:

      This study examines Müller glia (MG) reprogramming in the uninjured mouse retina through a combination of Notch signaling inhibition and AAV-induced proliferation. Building on their prior work showing that Cyclin D1 overexpression and p27^Kip1^ knockdown (CCA) promotes MG proliferation with very limited neurogenesis, the authors now demonstrate that Rbpj deletion alone induces a modest degree of MG-to-neuron conversion without proliferation, in agreement with recent work in the field. However, combining Rbpj deletion with CCA-mediated proliferation substantially enhances MG dedifferentiation and the generation of retinal neuron-like cells. Through genetic lineage tracing, histological analyses, and single-cell transcriptomics, the authors provide evidence that MG-derived cells acquire molecular features of bipolar (ON, OFF, and rod bipolar) and amacrine neurons. Most MG-derived cells appear to survive long-term (up to 9 months).

      Strengths:

      Overall, the study is carefully designed and executed, and the manuscript is clearly written with well-presented figures. While the work does not significantly expand the repertoire of neuronal types generated from mammalian MG beyond what has been previously reported in the field, it provides a valuable and improved strategy for inducing robust MG proliferation and neurogenesis in the mammalian retina.

      Weaknesses:

      (1) It would be better to include a negative control AAV when evaluating the effect of CCA AAV in the Rbpj KO background. This could help distinguish the specific contribution of the CCA construct from potential effects of intravitreal AAV injection itself, which can induce mild inflammation, known to influence MG reprogramming.

      (2) The extent of MG transduction by the CCA AAV is not clear. As quantifications are normalized to total MG (GFP^+^ or TdTomato^+^) or retinal length, it would be useful to clarify whether near-complete transduction is assumed, or if additional information on transduction efficiency can be provided.

      (3) In Figure S10, the reduced MG proliferation observed in the CCA + Rbpj deletion group could also potentially reflect decreased GFAP promoter activity in dedifferentiated MG following Rbpj deletion. Alternatively, MG-derived cells may be more fragile under these conditions.

      (4) In the CCA + Rbpj deletion condition, do MG undergo single or multiple rounds of cell division?

      (5) What fraction of neuron-like cells (bipolar- and amacrine-like) arises from proliferation versus direct transdifferentiation? Quantification of MG-derived cells expressing neuronal markers (e.g., Otx2, HuC/D), with and without EdU labeling, would help distinguish these mechanisms.

      (6) In Figure S18a, the authors state that "while the neuron-like clusters were best classified as BC-like and AC-like based on their distinct marker gene expression, they also exhibited mixed expression of genes associated with other retinal neuronal types, including RGC markers (e.g., Tubb3, Myt1l, Grin1) and photoreceptor markers (e.g., Crx, Prom1, Epha10, Gucy2e, Scg3) (Fig. S18a), suggesting that the regenerated cells exist in a hybrid state" and "MG derived neuron like cells also expressed genes characteristic of RGCs and photoreceptors, indicating enhanced lineage". However, many of these genes are not specific to RGCs or photoreceptors and are instead broadly expressed in retinal neurons or enriched in bipolar/amacrine populations. Therefore, it is unclear whether these cells exhibit hybrid RGC or photoreceptor identity.

      (7) The authors provide a thorough molecular characterization of MG-derived cells through immunostaining and single-cell sequencing. However, their morphological features, synaptic connectivity (e.g., synaptic marker expression), and electrophysiological properties remain largely uncharacterized. While these experiments may be technically challenging, this limitation should be discussed.

      (8) The conclusion that CCA + Rbpj deletion induces neurogenesis without compromising MG supportive functions or retinal homeostasis appears somewhat oversold. This claim is primarily based on gross retinal morphology and ZO-1 staining. Given the extent of MG dedifferentiation and ectopic cell generation in the ONL and INL, it is likely that retinal function is affected. Functional assessments (e.g., ERG) would be required to support this conclusion. The authors should consider tempering this statement.

      (9) Regarding the mechanism by which CCA-induced proliferation enhances MG reprogramming in the Rbpj knockout background, one plausible explanation is that chromatin states (e.g., histone modifications and DNA methylation) are transiently reset during DNA replication and cell division. While this alone may be insufficient to activate neurogenic programs, it could synergize with Rbpj deletion to allow neurogenic transcription factors (such as Ascl1, Otx2, NeuroD1, and NeuroD2) to access previously inaccessible chromatin regions, thereby promoting MG reprogramming.

    3. Reviewer #2 (Public review):

      Summary:

      The inability of the mammalian retina to regenerate poses a major clinical challenge. Much has been learned about the regenerative potential of the retina from teleost fish, where Müller glia (MG) are able to proliferate and produce new neurons after injury. However, MG do not retain this potential in the mammalian retina. The authors showed previously that forcing MG to re-enter the cell cycle by downregulating p27 and upregulating cyclin D1 could induce MG to dedifferentiate, but the results were transient, and these cells eventually reverted back to MG and did not form neurons. Here, they expand on this to show that in MG, coupling forced cell cycle re-entry with deletion of Rbpj, which inhibits the transcriptional effects of Notch signaling, induces some MG to proliferate and take on features of multiple cell types, including MG precursor cells, amacrine-like cells, and bipolar-like cells. This work lends valuable insight into the regenerative potential of mammalian MG, particularly when Notch signaling is manipulated.

      Strengths:

      The major claims of the authors are well-supported. They show convincingly - and through multiple methods including immunostaining, single-nucleus RNA sequencing, and in situ hybridization - that coupling Notch inhibition with cell cycle reactivation induces the expression of neuronal markers in mammalian MG. The snRNA-seq data are particularly valuable in demonstrating the induction of bipolar-cell subtypes. EdU labeling is effective in demonstrating the induction of proliferation, and the long-term viability of the generated neuron-like cells is intriguing.

      Weaknesses:

      Whether the newly generated neurons are functionally integrated remains unclear, and the effect of the manipulation on the function of the retina was not tested. Imaging data suggest that many of the newly generated neurons persist for months, but often appear mislocalized. It is also not clear whether the manipulation of MG affects long-term MG function. Cell death was not evaluated, and although the authors evaluated the long-term effect on tight junctions, these data were not quantified, and no further analysis of morphology or function was done. Control eyes were untreated, not vehicle-injected.

    1. eLife Assessment

      This important study probes the long-standing failure to resolve evolutionary relationships between the classical "spiralian" taxa - i.e., annelids, molluscs, brachiopods, platyhelminths and nemerteans - and provides convincing evidence that the branches leading to them are so short as to be unreliable guides to their relationships. This, in turn, has wide-ranging implications for our understanding of animal body plan evolution and the interpretation of early animal fossils.

    2. Reviewer #1 (Public review):

      Summary:

      This interesting paper probes the problematic relationships between the classical "spiralian" taxa, i.e., annelids, molluscs, brachiopods, platyhelminths and nemerteans, and shows that the branches leading to them are so short as to be unreliable guides to their relationships. This, in turn, has important implications for how we view the origin of the animal phyla.

      Strengths:

      A very careful analysis of a famous old problem with quite significant results. The results seem to be robust and support their conclusions.

      It often passes uncommented that many different trees are published about animal relationships, yet some parts of the tree seem extremely difficult to resolve; the spiralians are perhaps the most difficult case. More recently, problems about sponges or ctenophores as sister groups to the rest of the animals have alerted us to major areas of uncertainty in large-scale phylogenetic reconstruction; this paper is a welcome reminder that other, perhaps even harder, problems exist which may be difficult to ever resolve with the (molecular) data we have.

      Weaknesses:

      The paper could have perhaps drawn out some of the implications of its results in a clearer manner.

    3. Reviewer #2 (Public review):

      Summary:

      The relationships among the phyla making up Spiralia - a major clade of animals including molluscs, annelids, flatworms, nemerteans and brachiopods - have been challenging from a phylogenomic perspective despite decades of molecular phylogenetic effort. Every topology uniting subsets of these phyla has been recovered with apparent support in at least one study, yet no consensus has emerged even from large-scale genomic datasets. Serra Silva and Telford set out to determine whether this instability reflects a genuine biological signal being obscured by analytical limitations, or whether it reflects a rapid, near-simultaneous origin of these phyla that has left behind in modern genomes far too little phylogenetic information to resolve. They focused deliberately on five phyla, reducing the problem to a tractable set of 15 unrooted and 105 rooted topologies, and applied a suite of complementary approaches across two independent datasets and multiple substitution models to test whether any topology is significantly preferred over alternatives.
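
      The topology counts quoted above follow from the standard double-factorial formulas for strictly bifurcating trees with n labelled tips: (2n - 5)!! unrooted and (2n - 3)!! rooted. A minimal sketch, with function names chosen here purely for illustration, reproduces the figures for five phyla and shows how quickly the search space grows if a sixth phylum is added:

```python
def double_factorial(k: int) -> int:
    """Return k!! = k * (k - 2) * (k - 4) * ... down to 1 or 2."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def unrooted_topologies(n_tips: int) -> int:
    """Bifurcating unrooted trees on n labelled tips (n >= 3)."""
    return double_factorial(2 * n_tips - 5)


def rooted_topologies(n_tips: int) -> int:
    """Bifurcating rooted trees on n labelled tips (n >= 2)."""
    return double_factorial(2 * n_tips - 3)


for n in (5, 6):
    print(n, unrooted_topologies(n), rooted_topologies(n))
```

      For five tips this gives 15 unrooted and 105 rooted topologies, matching the numbers in the summary; for six tips the counts rise to 105 and 945.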

      Strengths:

      (1) The conceptual framing of the problem is excellent, and the study makes a convincing case across several lines of evidence. By enumerating all possible topologies and demonstrating empirically that every one of the 15 unrooted arrangements has been recovered as the preferred solution in at least one published study, the authors make a strong argument about the state of the field. The use of two entirely independent datasets as a consistency check is great, and convergence between them, where it occurs, substantially strengthens confidence in the conclusions.

      (2) It is my view that the simulation framework is a particular strength. Generating data on a fully unresolved star tree and scoring those data under both correctly-specified and misspecified substitution models provides convincing evidence that the strong preference for rooting Spiralia on the flatworm branch is, at least partly, an analytical artefact driven by the exceptionally long branch in combination with compositional heterogeneity across sites. This is an important methodological demonstration with implications beyond spiralian phylogenetics, as the same issue is likely to affect other deep, long-branched lineages in the animal tree of life.

      (3) The randomised taxon-jackknifing approach is a very nice addition here. The demonstration that preferred topologies shift depending on which species happen to be sampled (even within the same phylum) is a convincing indicator of weak signal, and provides a practical caution for future studies that may report strong support for a particular spiralian arrangement based on a fixed taxon sample.

      (4) The branch-length analyses, benchmarking internal interphylum branches against the already disputed and extremely short branch uniting deuterostomes (work also by this group), are well-conceived and solid.

      (5) I think it is worth highlighting the notable intellectual honesty throughout the paper: the authors do not overstate their results, correctly acknowledging that while the unrooted topology grouping molluscs with brachiopods and flatworms with nemerteans emerges most consistently, this preference is not statistically significant under more adequate substitution models and may itself carry some artefactual component.

      Weaknesses:

      (1) The restriction to five phyla is the most significant limitation. The authors acknowledge this and give a clear computational justification, but readers should be aware that the paper's convincing conclusions apply specifically to the five focal phyla and the evidence remains incomplete with respect to spiralian phylogeny as a whole.

      (2) The treatment of substitution model adequacy, while commendably thorough for site-heterogeneous models, is necessarily bounded. The authors note that models accounting for non-stationarity, across-lineage compositional heterogeneity, or mixtures of tree histories might yield different results, and that even the most sophisticated currently available approaches have not produced consistent spiralian topologies across studies. This is not a criticism of what has been done here - the analytical scope is reasonable and well-implemented - but it means the paper cannot be read as a definitive demonstration that no model will ever resolve these relationships. The distinction between a true hard polytomy and a radiation that is effectively unresolvable given current data and methods could be drawn more sharply in the discussion.

      (3) The reticulation-aware coalescent analyses are presented somewhat briefly relative to the likelihood-based topology scoring. The finding that flatworms are recovered within a paraphyletic jaw-bearing animal clade in both summary trees - interpreted as long-branch attraction - is striking, and its implications for gene-tree-based approaches to spiralian rooting deserve more discussion than they currently receive.

      (4) The central conclusions - that interphylum branches in Spiralia are extraordinarily short, that topological preferences are strongly model-dependent and taxon-sampling-sensitive, and that an ancient rapid radiation is the most parsimonious explanation - are convincingly supported by the evidence presented. The identification of flatworm long-branch attraction as an important confounding factor in rooting analyses is itself an important and well-demonstrated result.

      Conclusion:

      This paper clearly makes an important contribution to the ongoing debate about spiralian relationships and, more broadly, to methodological discussions about how to handle anciently diversified clades where phylogenetic signal is genuinely limited. The exhaustive topology-scoring framework combined with taxon-jackknifing and simulation under unresolved trees is a valuable methodological template that could usefully be applied to other notoriously difficult nodes in the animal tree. I thoroughly enjoyed the discussion of the implications of these findings for interpreting Cambrian fossils and the evolutionary history of shells, segmentation, larval types and other characters - it is both thoughtful and thought-provoking and will be of broad interest well beyond the phylogenomics and zoology communities. From a very practical perspective, the data and scripts provided make the work useful to researchers wishing to apply similar approaches to other groups.

    4. Reviewer #3 (Public review):

      Summary:

      This paper addresses the controversial internal relationships within the Spiralia, a major clade of invertebrate animals including molluscs, annelids, brachiopods and flatworms.

      Strengths:

      Performs a range of empirical analyses and simulations that address the core question. Although a favoured unrooted topology finds some support, this is not strongly endorsed in the paper.

      Weaknesses:

      (1) Only considers a subset of relevant phyla (e.g. gastrotrichs are relevant to the phylogenetic position of Platyhelminthes), although how this would change the scale of the analyses (i.e. number of topologies) is addressed in the paper.

      (2) Discussion of Spiralia evolution and broader context, particularly the relevance for the fossil record. Line 448: our current understanding of the early spiralian fossil record is quite consistent with the main results of this paper. For example, there are very few claims for fossils that sit on the short branch leading to Spiralia (or Lophotrochozoa as defined here) that this paper discusses. Many of the key fossils that inform on the characters discussed in the introduction, which have unusual character combinations, have an apomorphy of one of the phyla discussed, and so are resolved as members of the stem lineages of particular phyla.

      (3) This is what you would expect with long phylum stem lineages (line 148) and a short Spiralia stem lineage. For example, the mollusc Wiwaxia has chaetae but a mollusc-like radula (Smith 2012), and the conchiferan mollusc Pelagiella has chaetae and a coiled shell (Thomas et al. 2020). The only fossil groups that are routinely discussed as belonging to the stem lineage of more than one phylum are the tommotiids, which have chaetae, segmentation and a complex mineralised skeleton (but not shells in the brachiopod/mollusc sense, see Guo et al. 2023), but they sit on the lophophorate stem lineage, a synapomorphy-rich group whose monophyly the present paper endorses (e.g. line 435). The fossil record is consistent with the scenario presented in line 442, e.g. convergent loss or reduction of chaetae and segmentation and convergent evolution of shells in molluscs and brachiopods.

    1. eLife Assessment

      The authors provide valuable findings showing that GM-CSF prevents the loss of ILC3 populations during gut inflammation and inhibits pro-inflammatory cytokine production. They combine a preclinical model of gut inflammation in zebrafish with spatial transcriptomic analysis of samples from Crohn's disease patients. Although the data provided are clear and point to an anti-inflammatory role of GM-CSF, the strength of evidence remains incomplete as no mechanistic insights into GM-CSF regulation of ILCs are provided, and the most significant mechanistic question remains unanswered: what are the signals downstream of GM-CSF that maintain the ILC3 population? This work will be of interest to immunologists.

    2. Reviewer #1 (Public review):

      This study integrates Xenium spatial transcriptomics of paired inflamed and uninvolved Crohn's disease tissues with functional analyses in a csf2rb-/- larval zebrafish DSS intestinal injury model to investigate the spatial and cell-type-specific roles of GM-CSF. The work is limited mechanistically and adds little to an already disputed field: GM-CSF's role in intestinal inflammation is context-dependent and extensively studied in mice and humans, and this study does not resolve these controversies. The zebrafish appears to be a poor model for these questions: it lacks mammalian intestinal architecture, complex microbiota, and clearly validated functional ILC populations. Putative ILC1s are poorly defined based on stress-response gene modules, while ILC3s are somewhat better characterized, but overall, the system does not allow mechanistic insights into GM-CSF regulation of ILCs. The DSS experiments largely recapitulate the known protective effects of GM-CSF in epithelial injury without clarifying underlying mechanisms.

      Figure 1

      GM-CSF expression is extremely sparse, rarely exceeding 0.005 frequency even in inflamed regions. The authors should acknowledge this and discuss why. Xenium could be used to characterize the niche around GM-CSF-producing cells, but no new cellular circuit is revealed beyond known myeloid-lymphoid interactions.

      Figure 2

      Colon length in DSS colitis is not decreased in Csf2rb⁻/⁻ versus wild-type zebrafish under untreated conditions, suggesting endogenous GM-CSF has minimal impact. In Figure 2E, Tg(mpeg1:mCherry) larvae show staining in vessel- or epithelial-like structures expressing Csf2rb, which does not resemble macrophages and requires clarification. pSTAT5 is upregulated with GM-CSF treatment, but the responding cell types are unclear.

      Figure 3

      Putative ILC1s are defined by stress-response gene modules rather than canonical markers. Overlapping genes with human (HSP90AA1, UBB, MCL1, DOK2) do not indicate ILC1 identity, which is described by IL7R, KLRB1, or TBX21 expression in the human Xenium dataset. ILC2s were not detected, and Ifng expression is broadly distributed, making attribution to ILC1s uncertain. ILC3s are somewhat better defined, but overall, the data do not support mechanistic conclusions about GM-CSF regulation of ILC populations.

    3. Reviewer #2 (Public review):

      The authors show that GM-CSF prevents the loss of ILC3 populations and inhibits pro-inflammatory cytokine production during gut inflammation. They combine a preclinical model of gut inflammation in zebrafish with spatial transcriptomic analysis of samples from Crohn's disease patients. The data show that GM-CSF ameliorates gut inflammation by (1) curtailing the differentiation of disease-associated ILC1 and (2) by "boosting" the tissue repair function of ILC3.

      The topic of the manuscript is interesting. However, there are various limitations that are summarized below.

      (1) The main finding of the manuscript, that GM-CSF maintains ILC3 populations, is not analyzed in depth. Since the authors' own data and other publications show that the receptors for GM-CSF are expressed in myeloid cells, a better analysis of the transcriptional changes of these populations upon GM-CSF administration is needed.

      (2) The authors could compare the transcriptome of macrophages and monocytes from inflamed and uninvolved sections in their Xenium dataset. In addition, investigating how zebrafish macrophages change due to the lack of GM-CSF and comparing them with the human findings would add to the data.

      (3) Since the authors developed a novel mutation in zebrafish that is predicted to affect myeloid populations, a detailed characterization of the myeloid immune compartment in these organisms is missing.

      (4) Niche analysis in the Xenium slides could provide direct evidence on how macrophages close to ILC3 are different from those closer to other cell types, like ILC1.

    4. Author response:

      We thank the editors and reviewers for their careful evaluation of our manuscript, “GM-CSF regulates ILC states and myeloid cell signaling during ulceration in Crohn’s disease.” We appreciate the constructive feedback and agree that strengthening the mechanistic understanding of GM-CSF signaling in the regulation of ILC populations will significantly improve the study.

      The reviewers identified a key gap regarding the downstream mechanisms by which GM-CSF maintains ILC3 populations and limits ILC1 expansion. In response, we will focus our revision on defining the myeloid-mediated pathways downstream of GM-CSF that regulate ILC states.

      Specifically, we plan to: 

      (1) Characterize myeloid cell responses to GM-CSF signaling

      We will perform additional analyses of both our Xenium spatial transcriptomics and zebrafish single-cell RNA-seq datasets to identify transcriptional changes in macrophages and monocytes associated with GM-CSF signaling. This will include differential gene expression and pathway enrichment analyses to uncover candidate signaling pathways (e.g., cytokine and STAT5-associated programs) that may mediate ILC regulation.

      (2) Strengthen spatial niche analysis in human tissue

      We will refine our Xenium-based analyses to better define the cellular microenvironments surrounding GM-CSF-producing cells, including higher-resolution visualization and quantification of receptor-expressing target cells and signaling niches within ulcerated regions.

      (3) Further define immune cell populations in the zebrafish model

      We will enhance the definition of ILC subsets by incorporating additional marker-based analyses and clarifying their relationship to human ILC populations. In parallel, we will more thoroughly characterize the myeloid compartment in csf2rb-deficient zebrafish to determine how GM-CSF signaling impacts these populations.

      (4) Clarify analysis methods and presentation

      We will address all points related to statistical testing, data visualization, and figure clarity raised by the reviewers, including the use of appropriate statistical comparisons for multi-group analyses and improved annotation of gene modules and data sources.

      Together, these revisions will provide a clearer mechanistic framework linking GM-CSF signaling in myeloid cells to the maintenance of ILC3 populations and suppression of inflammatory ILC1 responses.

      We believe these additions will substantially strengthen the manuscript and address the reviewers’ concerns. We appreciate the opportunity to revise our work and look forward to submitting a revised version.

    1. eLife Assessment

      This manuscript presents convincing data that provide a high-resolution structure of the Egl-RNA complex. The findings are important for understanding the formation, stability, and interactions of this complex. However, the manuscript could be improved by conducting a rigorous statistical analysis, developing a deeper understanding of apparent discrepancies in the stoichiometric Egl-to-RNA ratio, and exploring the specificity of this complex using a more diverse set of control RNAs.

    2. Reviewer #1 (Public review):

      Summary:

      The authors sought to define the molecular mechanism by which the adaptor protein Egalitarian (Egl) recognizes and binds specific mRNA localization signals -- in particular, the K10 transport and localization signal (TLS) -- to initiate dynein-based transport in Drosophila. In doing so, they identified the minimal Egl domains required for RNA binding, determined the atomistic structure of the Egl-RNA complex, and explored the recognition mechanism (shape vs. structure). They furthermore performed in vivo functional validation using CRISPR-mediated genome editing in Drosophila that showed that the identified binding residues are biologically essential.

      Strengths:

      The authors provided a detailed crystal structure of the Egl-RNA complex at high resolution. In particular, they used a MBP-fusion crystallization driver to be able to resolve the flexible C-terminal domain of Egl (EHD). The authors' use of an integrative approach combining X-ray crystallography with binding assays and in vivo functional validation provides compelling evidence for their claims.

      The work provides a detailed interaction mapping that identifies the protein residues responsible for the electrostatic interaction with the RNA. In doing so, the work explains how Egl can recognize diverse RNA sequences by demonstrating that Egl binds primarily to the phosphate backbone and specific structural bulges, providing a plausible model for how one protein can recognize many different localization signals that share little sequence similarity.

      Weaknesses:

      Discrepancy in the stoichiometric Egl-to-RNA ratio (the structural data in the paper indicate a 1:1 ratio, whereas previous single-molecule transport studies suggest a 2:1 ratio) remains unanswered, with the likely explanation that the truncated version of the protein might not capture the full (native) assembly. While the authors acknowledge this in the Discussion, the paper would benefit from this issue being raised earlier, already in the Results section. Moreover, there is a notable omission of a recent preprint on a very similar topic [https://www.biorxiv.org/content/10.1101/2025.08.02.668268v1.full].

      In vitro, Egl shows a relatively high affinity for non-target RNAs such as the MS2 loop, whereas it is highly selective in vivo. Is it possible that other cofactors are required for the high-fidelity sorting not present in the study? Testing binding in the presence of co-factors (BicD or Dlc) could indicate whether they increase the specificity for target RNAs over non-target ones.

      Including a more diverse set of size-matched RNA controls would have significantly strengthened the paper's claims regarding specificity. Using RNAs that mimic K10 TLS would have provided a more rigorous test of the shape-recognition by Egl - using, for instance, decoy RNAs of the same length but with differently positioned bulges (or no bulges at all) or testing other known localization signals (like bicoid or hairy) of similar length.

      Appraisal of aims:

      The authors successfully determined the crystal structure of the Egl-RNA complex, identifying a modular binding surface composed of the EXO domain, a helical linker, and the EHD. They effectively demonstrated that Egl uses a combination of shape-specific recognition (targeting RNA bulges) and sequence-specific interactions (bonding with specific bases), and confirmed the biological necessity of these findings by showing that mutating the identified residues in living flies leads to infertility and oocyte differentiation defects. These results provide robust evidence for the authors' claims that they have defined a minimal RNA localization signal. In particular, the correlation between the L-Triple mutation's binding defect and its total sterility in flies provides proof that the identified binding surface is the functional one. While the 1:1 stoichiometry remains a point for further investigation, the authors transparently address that full-length transport may require a 2:1 assembly, suggesting their structure represents the fundamental building block of that larger complex.

      Impact of the work on the field:

      This study provides a high-resolution picture of how a dynein adaptor recognizes its cargo. It moves the field from predictive models to atomic-level certainty, setting a benchmark for studying other similar transport complexes. By proving that Egl recognizes RNA shape (bulges) as much as sequence, the work changes the outlook on the search for localization signals in other genomes, moving beyond simple sequence motifs to 3D structural signatures. The coordinates deposited in the EBI (IDs: 9UJU, 9UJY, 9UUG) provide a resource for the modelling of higher-order transport complexes. The identification of specific residues (e.g., the L-Triple) provides the community with tools to disrupt RNA transport in Drosophila without destroying the entire protein, allowing for more nuanced studies of development.

    3. Reviewer #2 (Public review):

      Summary:

      Hong et al. aimed to elucidate the structural basis of the Egalitarian recognition of the K10 mRNA. Using X-ray crystallography and several biochemical, biophysical, and cellular techniques, they were able to shed light on the formation, stability, and basis of interaction of the complex. The authors successfully accomplished their goal.

      Strengths:

      The experiments are well-performed and convincing. The manuscript is well-written.

      Weaknesses:

      (1) Some statistical analysis would improve the manuscript. In particular, the manuscript has several results that are based on comparisons, such as Kd. Adding p-values for significance is recommended, and this would improve the treatment of data.

      (2) When showing interactions (dotted lines) in structural figures, adding the distance would be useful and is recommended.

      (3) Additional SI Figure. It would enrich the manuscript to have the composite simulated annealing-omit 2|Fo| - |Fc| electron density maps for the structures contoured at a given sigma, superimposed on the final refined model. This would represent how well the data fits into the model.

    4. Author response:

      We would like to thank the editors and the reviewers for their thoughtful and constructive assessment of our manuscript. We appreciate the reviewers' positive recognition of our research and their thoughtful assessment of our data.

      In the upcoming revision, we will incorporate rigorous statistical analysis (p-values) for our binding assays, optimize the structural figures and summary tables for better clarity, and discuss the recent preprint paper alongside the nuances of Egl-BicD stoichiometry. Regarding the suggestion for CLIP-seq, we agree that a global analysis would be a valuable extension of this work. However, as our lab’s core expertise is in structural biology, and the in vivo functional studies in this manuscript were conducted through a collaboration to validate our structural findings, we feel that such a large-scale genomic study falls beyond the scope of the current structural report.

    1. eLife Assessment

      This valuable study provides evidence that the integration of the nuclear envelope into the endoplasmic reticulum provides a mechanism for mechanical integration across this continuous membrane system. This work opens up new avenues for studying organelle membrane tension homeostasis. The evidence was found to be convincing and carefully quantified, with minor limitations that we expect to be further explored in future work.

    2. Reviewer #1 (Public review):

      Summary:

      Zare‑Eelanjegh et al. investigate how the endoplasmic reticulum, the nucleus, and the cell periphery are mechanically linked by indenting intact cells with specially shaped atomic‑force probes that double as drug injection devices. Fluorescence‑lifetime imaging of the membrane tension reporter Flipper‑TR reveals that these three compartments are mechanically linked and that the actin cytoskeleton, microtubules, and lamins modulate this coupling in complex ways.

      Strengths:

      * The study makes an important advance by applying FluidFM to probe organelle mechanics in living cells, a technically demanding but powerful approach.

      * Experimental design is quantitative, the data are clearly presented, and the conclusions are broadly consistent with the measurements.

      Weaknesses:

      * Calcium‑dependent effects: Indentation can evoke cytoplasmic Ca²⁺ elevations that drive myosin contraction and reshape the internal membrane network (e.g., vesiculation: PMID: 9200614, 32179693), possibly confounding the Flipper-TR responses; without simultaneous/matching Ca²⁺ imaging, cell viability assays (e.g., Sytox), and intracellular Ca²⁺ sequestration or myosin inhibition experiments, a more complex mechanochemical coupling cannot be excluded, weakening conclusions.

      * Baseline measurements: Flipper‑TR lifetime images acquired without indentation do not exclude potential light‑induced or time‑dependent changes, which weakens the conclusions.

      * Indentation depth versus nuclear stiffness/tension: Because lamin‑A/C depletion softens nuclei, a given force may produce a deeper pit and thus greater membrane stretch. It is unclear how the cytoskeletal perturbations affect indentation depth, which weakens the conclusions.

      Comments on revisions:

      With their responses, the authors have relieved my initial concerns.

    3. Reviewer #2 (Public review):

      Summary

      This valuable study combines atomic force microscopy with genetic manipulations of the lamin meshwork and microinjection of cytoskeletal depolymerizing drugs to probe the mechanical responses of intracellular organelles to combinations of cytoskeletal perturbations. This study demonstrates both local and distal responses of intracellular organelles to mechanical forces, and shows that these responses are affected by disruption of the actin, microtubule, and lamin cytoskeletal systems.

      Strengths:

      This study uses a sensitive micromanipulation system to apply and visualize the effects of force on intracellular organelles.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Zare-Eelanjegh et al. investigate how the endoplasmic reticulum, the nucleus, and the cell periphery are mechanically linked by indenting intact cells with specially shaped atomic force probes that double as drug injection devices. Fluorescence-lifetime imaging of the membrane tension reporter Flipper-TR reveals that these three compartments are mechanically linked and that the actin cytoskeleton, microtubules, and lamins modulate this coupling in complex ways.

      Strengths:

      (1) The study makes an important advance by applying FluidFM to probe organelle mechanics in living cells, a technically demanding but powerful approach.

      (2) Experimental design is quantitative, the data are clearly presented, and the conclusions are broadly consistent with the measurements.

      Weaknesses:

      (1) Calcium-dependent effects: Indentation can evoke cytoplasmic Ca<sup>2+</sup> elevations that drive myosin contraction and reshape the internal membrane network (e.g., vesiculation: PMID: 9200614, 32179693), possibly confounding the Flipper-TR responses; without simultaneous/matching Ca<sup>2+</sup> imaging, cell viability assays (e.g., Sytox), and intracellular Ca<sup>2+</sup> sequestration or myosin inhibition experiments, a more complex mechanochemical coupling cannot be excluded, weakening conclusions.

      (2) Baseline measurements: Flipper-TR lifetime images acquired without indentation do not exclude potential light-induced or time-dependent changes, which weakens the conclusions.

      (3) Indentation depth versus nuclear stiffness/tension: Because lamin-A/C depletion softens nuclei, a given force may produce a deeper pit and thus greater membrane stretch. It is unclear how the cytoskeletal perturbations affect indentation depth, which weakens the conclusions.

      Reviewer #2 (Public review):

      Summary:

      This useful study combines atomic force microscopy with genetic manipulations of the lamin meshwork and microinjection of cytoskeletal depolymerizing drugs to probe the mechanical responses of intracellular organelles to combinations of cytoskeletal perturbations. This study demonstrates both local and distal responses of intracellular organelles to mechanical forces and shows that these responses are affected by disruption of the actin, microtubule, and lamin cytoskeletal systems. Interpretation of these effects is limited by the absence of key data determining whether acute microinjection of cytoskeleton-depolymerizing drugs has complete or partial effects on the targeted cytoskeletal networks.

      Strengths:

      This study uses a sensitive micromanipulation system to apply and visualize the effects of force on intracellular organelles.

      Weaknesses:

      The choice to deliver cytoskeleton-depolymerizing drugs by local microinjection is unusual, and it is unclear to what extent actin and microtubule filaments are actually depolymerized immediately after microinjection and on the minutes-length timescale being evaluated in this study. This omission limits the interpretation of these data.

      Reviewer #3 (Public review):

      Summary:

      Using an approach developed by the authors (FluidFM) combined with FLIM, they discover that a mechanical force applied over the cell nucleus triggers mechanical responses dependent on the Lamina composition.

      Strengths:

      The authors present a new approach to study mechano-transduction in living cells, with which they uncover lamin-dependent properties of the nucleus.

      Weaknesses:

      (1) The transfer of the mechanical response from the Lamina to the ER is not fully covered.

      (2) In Figure 4D, WT dots are the same for each compartment. Why do the authors not make one graph for each compartment with WT, A-KO, B-KD, and A-KO/B-KD together?

      (3) In Figure 1E, the authors showed well how the probe deforms the nucleus. It is not indicated in the material and methods section or in the figure legend, where, in Z, the acquisition of FLIM images was made or if it is a maximum projection. I assume it was made at a plane in the middle of the nucleus to see the nuclear envelope border and the ER at the same time. Did the authors look at the nuclear membrane facing upward, where most of the deformation should occur? Are there more lifetime changes? In Figure D, before injection of CytoD, we can clearly see a difference at the pyramidal indentation site with two different lifetime colors.

      (4) A great result of this article regards the importance of Lamins, A and B, in triggering the response to a mechanical force applied to the nucleus. Could 3D imaging for LaminA and LaminB be performed at the different time points of indentation to see how the lamins meshworks are deformed and how they return to basal state? This could be correlated with the FLIM results described in the article.

      (5) Lamins form a meshwork underneath the nuclear membrane. They are connected to the cytoskeletons mainly by the LINC complex. Results presented here show that the cytoskeletons are implicated in transferring the stimulus from the nuclear envelope to the ER. Could the author perform the same experiments using Nesprin-2 or/and Nesprin-1 or/and SUN1/2 knockdowns to determine if this transmission is occurring through the LINC complex or rather in a passive way by modifying the nuclear close surroundings?

      (6) The authors used cytoskeleton drugs, CytoD and Nocodazole, with their FluidFM probe, but did not show if the drugs actually worked and to what extent by performing actin or microtubule stainings. In the original paper describing FluidFM, 15 s were enough to obtain a full FITC-positive cell after injection. Here, the experiments are around 5 minutes long. I therefore question the rationale behind the injection of the drugs compared to direct incubation, besides affecting only the cell currently under indentation.

      We thank the reviewers for their constructive criticisms and suggestions. Accordingly, we amended the manuscript and the figures.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) Calcium-dependent effects: Indentation can evoke cytoplasmic Ca<sup>2+</sup> elevations that drive myosin contraction and reshape the internal membrane network (e.g., vesiculation: PMID: 9200614, 32179693) that may affect Flipper-TR signals independent of membrane tension; without simultaneous Ca<sup>2+</sup> imaging, cell viability assays (e.g., Sytox costaining), intracellular Ca<sup>2+</sup> sequestration or myosin inhibition, a more complex mechanochemical coupling cannot be excluded. Tracking ER morphology during the experiments with luminal and membrane markers would further clarify this point.

      The goal of our article is to exhibit and quantify tension propagation and tension homeostasis across the different organelles that govern the mechanosensitivity, and thus the mechanoresponse, of the cell. To this end, the test cells (drug-injected cells) were compared with a control group of non-drug-injected cells (Fig. 2 and Fig. 3). In these cases, potential overall responses of the cells to indentation, e.g. potential changes in Ca<sup>2+</sup> sequestration, are covered by the control group.

      Interestingly, injecting CytoD with cylindrical probes while indenting cells produced higher tension at the NE compared to the control group of non-drug-injected cells. This indicates a larger effect arising from the disturbance of F-actin than from the indentation process itself, at least where the cells were stimulated using cylindrical probes. That was also the reason why, in the next steps of this study, including varying the indentation site from the nucleus to the ER or cell periphery as well as comparing WT cells with cells of varied lamina composition, only cylindrical probes with a minimized indentation effect on the NE and the ER were used.

      Lastly, to examine the response to tension changes and calcium dynamics simultaneously, we have meanwhile extended our study and analyzed cells treated with different cytoskeleton-disturbing drugs (e.g., CytoD), subjected to viscoelasticity measurements using AFM indentation (i.e. cell relaxation studies following indentation), and injected with drugs perturbing the regulation of Ca<sup>2+</sup> homeostasis (i.e., Thapsigargin), combined with simultaneous Ca<sup>2+</sup> imaging, for which another manuscript is in preparation.

      (2) Baseline measurements: Flipper-TR lifetime images acquired without indentation, collected with identical timing and illumination, are needed as controls to gauge potential light-induced or time-dependent changes.

      For every cell, a baseline referring to its tension in the relaxed state (without indentation) was quantified from a Flipper-TR image taken before the indentation and injection processes (“before”). As explained in the manuscript (lines 180-184), this baseline tension value was then subtracted from the tension measured over time by time-lapse Flipper-TR imaging over the course of 3-4 min of stimulation (indentation + injection) as well as immediately or 5 min post-stimulus. The control group (i.e., non-drug-injected cells or WT cells, where the effect of F-actin depolymerization or the effect of lamina composition was studied, respectively) was always measured in the same manner as the test group. As such, tension increases due to light-induced or time-dependent changes, or to indentation alone, were excluded.

      (3) Indentation depth versus nuclear stiffness/tension: Because lamin-A/C depletion softens nuclei, a given force may arguably produce a deeper pit and thus greater (not less) membrane stretch. Demonstrating that pit geometry depends only on applied force - and not on genetic or pharmacological perturbations - is necessary to rule out alternative interpretations.

      We thank the reviewer for raising this important point regarding the relationship between indentation depth and nuclear stiffness. To address whether pit geometry depends on applied force rather than genetic perturbations, we analyzed the piezo movement required to reach the 150 nN force setpoint across all experimental conditions (WT, LMNA KO, LMNB KD, and LMNA KO/LMNB KD cells).

      Our results (Fig. S6) demonstrate that there is no statistically significant difference in the piezo displacement from the contact point to the 150 nN setpoint between any of the experimental groups (Kruskal-Wallis H-test: H = 1.744, p = 0.627). This indicates that for a constant applied force of 150 nN, the indentation depth is equivalent across all conditions despite differences in nuclear stiffness.

      Therefore, the observed differences in tension response and perhaps the membrane stretch cannot be attributed to variations in indentation depth but rather reflect the intrinsic differences in molecular mechanical response to equivalent mechanical stimuli.

      This has been added in the manuscript in lines 282-286.
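
      As a rough illustration of the test quoted in this response, the following minimal sketch computes a Kruskal-Wallis H statistic and converts the reported H = 1.744 into a p-value. It assumes four groups (WT, LMNA KO, LMNB KD, LMNA KO/LMNB KD), hence 3 degrees of freedom, and the usual chi-square approximation; the function names are hypothetical, the toy data are invented, and no tie correction is applied. Under these assumptions the p-value comes out near the reported 0.627. This is not the authors' analysis code.

```python
import math
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of samples."""
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


def chi2_sf_df3(x):
    """Chi-square survival function for exactly 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# p-value implied by the reported H, assuming 4 groups (df = 3)
print(round(chi2_sf_df3(1.744), 3))

# Toy piezo displacements in arbitrary units (invented, for illustration only)
toy_groups = [[1.0, 1.2, 0.9], [1.1, 1.3, 0.8], [1.05, 0.95, 1.25], [1.15, 0.85, 1.02]]
print(kruskal_h(toy_groups))
```

      With real measurements, a library routine that also applies the tie correction would normally be preferable to this hand-rolled version.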

      Reviewer #2 (Recommendations for the authors):

      (1) Please clarify the distinctions between the pyramidal and cylindrical probes. The manuscript alludes to sharpening the cylindrical probe to facilitate membrane rupture. Do both probes rupture the plasma membrane upon force application? If so, at what applied force does this occur? It seems that PM rupture would also affect tension on intracellular membranes during and especially after force application.

      Yes, both cylindrical and pyramidal probes rupture the PM as well as the nuclear membrane when targeting the nucleus of cells. When targeting HeLa cells, used for this study, pyramidal probes puncture the membrane at a higher force of 100 nN, compared to rupture forces between 10 nN and 50 nN required for the sharpened cylindrical probes used here. This was explained in manuscript lines 112-115 for cylindrical probes and revised for pyramidal probes in lines 115-119.

      (2) Also re: probes: it is clear from Figure 1 that the total volume displacement induced by the pyramidal probe is far greater than the cylindrical probe. This greater displaced volume seems to be a very reasonable explanation for the increased membrane tension detected with the pyramidal probe, but this interpretation is not discussed.

      That is a good point, thank you! This has been added in lines 138-140.

      (3) Both cytochalasin D and nocodazole work by preventing new polymerization of monomers, which acutely affects new assembly and, over time, leads to loss of polymerized filaments. On the timescale of the experiments shown, it seems possible that acute effects on new filament assembly may be occurring, but that pre-assembled filaments may remain stable. It may thus be a misinterpretation to describe these conditions as "without actin fibers" or "without MTs". Further complicating matters, it is possible that the kinetics of filament disassembly may be altered by combinatorial treatment and/or in lamin knockout conditions versus wild-type cells. For instance, it has been shown that microtubule depolymerization increases actin contractility (see PMID 33089509). For these reasons, control experiments showing the extent of actin and/or microtubule disassembly in each condition tested are essential to interpret the data shown.

      Thank you for raising this valid point. This has been corrected and noted as "less actin fibers" and "less MTs". Regarding the timescale within which the drugs (e.g., CytoD and Nocodazole) affect filament assembly, a higher concentration of 50 µM for each of CytoD and Nocodazole, leading to a final concentration of 0.5 µM, was used for intracellular injection. This final physiologically relevant concentration was expected to act within as little as 12 min for CytoD and 1-5 min for Nocodazole when delivered directly inside the cell, excluding the time required to cross the plasma membrane. In particular, our study of the dynamic response of cells and the change in tension focuses on the early effects of the drugs and the deviation from the control groups rather than on the steady state reached at longer time points. The time estimates rely on values reported in the literature. For instance, a recent comprehensive study quantified actin dynamics and its interaction with CytoD using high-resolution images of single actin filaments acquired by total internal reflection fluorescence (TIRF) microscopy and reported a value of approximately 150 s (read from the graphs presented in Fig. 2D and 2F) as the starting point of inhibition of actin filament polymerization after introducing a flow of 5 nM CytoD into a chamber containing actin filaments.<sup>1</sup> In another study, a half-time of 40 s for the complete disassembly of microtubules in monocytes was reported for cells incubated with 1 µM Nocodazole.<sup>2</sup> This part was also included in the SI file, section “Mechanochemical stimulation”.

      (4) The presentation of some of the data could be clarified. For instance, it is unclear how some time course experiments can be non-significant but the endpoint analysis can be significant (for instance, Figure 3C vs. Figure 3D.)

      We agree that some instances require clearer interpretation. Indenting the cell nucleus with cylindrical probes induced a higher tension in CytoD-injected cells than in control cells at both the ER and NE, during and after the stimulus (Fig. 2E-F and Fig. 3C-D). Time-lapse tension analysis of these cells at the ER and NE showed close-to-significant and significant differences between test and control groups, respectively. p-values of 0.087 for Fig. 2E (bottom row, ER) and 0.042 for Fig. 3C (top row, ER) were captured at the ER for the last time point during the stimulus. For the “after stimulus” condition, significant differences between CytoD-injected and control cells at both the ER and NE were captured. The ER’s complex morphology consists of many curved structures of lumens and disks that can deform when subjected to external mechanical perturbation, making it prone to absorb stress and strain when directly targeted. That could explain the similar tension levels in CytoD-injected and control cells during ER indentation. Notably, unlike nucleus-targeted cells, ER-targeted cells show increased tension at the ER and NE in CytoD-injected cells compared to control cells only after stimulation. This suggests fundamental differences in the mechanical coupling of the nucleus and the ER to the cytoskeleton. While the nucleus maintains direct, structural actin connections through the nuclear lamina and LINC complexes (ref. 3), making it immediately sensitive to actin disruption, the ER relies on indirect, signaling-mediated cytoskeletal interactions (refs. 4, 5). Thus, the ER functions as a dynamic tension buffer that engages cytoskeletal support primarily during active repair processes following mechanical perturbation. This explains why nuclear probing reveals immediate tension differences in actin-disrupted cells, while ER probing only shows post-retraction effects. Consequently, statistical analysis detects significant differences between test and control groups after probe removal, but not during probe contact, in ER-targeted experiments. This is also explained more clearly in the manuscript at line 236.

      References

      (1) Mitani, T. et al. Microscopic and structural observations of actin filament capping and severing by Cytochalasin D. bioRxiv, 2025.01.28.635382 (2025).

      (2) Cassimeris, L. U., Wadsworth, P. & Salmon, E. D. Dynamics of microtubule depolymerization in monocytes. J Cell Biol 102, 2023-2032 (1986).

      (3) Maurer, M. & Lammerding, J. The Driving Force: Nuclear Mechanotransduction in Cellular Function, Fate, and Disease. Annu Rev Biomed Eng 21, 443-468 (2019).

      (4) Shi, X. et al. Actin nucleator formins regulate the tension-buffering function of caveolin-1. J Mol Cell Biol 13, 876-888 (2022).

      (5) van Vliet, A. R. & Agostinis, P. PERK and filamin A in actin cytoskeleton remodeling at ER-plasma membrane contact sites. Molecular & Cellular Oncology 4, e1340105 (2017).

    1. eLife Assessment

      This valuable manuscript describes ATP5I, a subunit of F1Fo-ATP synthase, as a key target of medicinal biguanides. The knockout of ATP5I in pancreatic cancer cells mimics biguanide treatment, inducing a metabolic switch from OXPHOS to glycolysis due to a compromised expression of the Complex I protein NDUFB8. This results in a markedly decreased NAD/NADH ratio and decreased cell proliferation. These solid findings point out ATP5I as a promising mitochondrial target for cancer therapies and contribute to our understanding of metformin's mechanism of action since many of its molecular mechanisms remain poorly understood.

    2. Reviewer #1 (Public review):

      Summary:

      In the manuscript entitled 'The Role of ATP Synthase Subunit e (ATP5I) in Mediating the Metabolic and Antiproliferative Effects of Biguanides', Lefrancois G et al. identify ATP5I, a subunit of F1Fo-ATP synthase, as a key target of medicinal biguanides. ATP5I stabilizes F1Fo-ATP synthase dimers, essential for cristae morphology, but its role in cancer metabolism is understudied. The research shows ATP5I interacts with a biguanide analogue, and its knockout in pancreatic cancer cells mimics biguanide treatment effects, including altered mitochondria, reduced OXPHOS, and increased glycolysis. ATP5I knockout cells resist biguanide-induced antiproliferative effects, but reintroducing ATP5I restores the effects of metformin and phenformin. These findings highlight ATP5I as a promising mitochondrial target for cancer therapies. The manuscript is well written.

      Strengths:

      The experiments were carried out using systematic and well-accepted methods.

      Weaknesses:

      Significance of the target molecule and mechanisms may help in understanding the molecular mechanisms of metformin.

      Comments on revisions:

      In the revised manuscript, the authors addressed all the queries.

    3. Reviewer #2 (Public review):

      Summary:

      The mechanism(s) by which the therapeutic drug metformin lowers blood glucose in type 2 diabetes and inhibits cell proliferation at higher concentrations remain contentious. Candidate mechanisms include inhibition of Complex I of the mitochondrial respiratory chain, with consequent changes in cellular metabolites that favour allosteric activation of phosphofructokinase-1, allosteric inhibition of fructose bisphosphatase-1 and cAMP signalling, and activation of AMPK, which phosphorylates transcription factors. The current manuscript proposes the e-subunit of ATP synthase as a putative binding protein of biguanides and demonstrates that it regulates the expression of the Complex I protein NDUFB8.

      Strengths:

      (1) The metformin conjugate and metformin show comparable efficacy on inhibition of cell proliferation in the millimolar range.

      (2) Demonstration of compromised expression of the Complex I protein NDUFB8 by the ATP5I knock out and its reversal by ATP5I expression is an important strength of the study. This shows that the decreased "sensitivity" to metformin in the ATP5I knock out cells could be due to various proteins.

      (3) Demonstration of converse effects of ATP5I KO and re-expression of ATP5I on the NAD/NADH ratio.

      Weaknesses:

      (1) The interpretation of the cellular co-localization of the biotin-biguanide conjugate with TOMM20 (Figure 1-D) as mitochondrial "accumulation" of the conjugate is overstated because it cannot exclude binding of the conjugate to the mitochondrial membrane. It would have been more convincing if additional incubations with the biotin-biguanide conjugate in combination with metformin had shown that metformin is competitive with the biotin-conjugate.

      (2) The manuscript reports the identification of 69 proteins by mass spectrometry of the pull-down assay of which 31 proteins were eluted by metformin. However, no Mass Spectrometry data is presented of the peptides identified. The methodology does not state the minimum number of peptides (1, 2?) that were used for the identification of the 31/69 proteins.

      (3) The validation of ATP5I was based on the use of recombinant protein (which was 90% pure) for the SPR and use of a single antibody to ATP5I. The validity of the immunoblotting rests on the assumption that there is no "non-specific" immunoactivity in the relevant mol wt range. Information on the validation of the antibody would be helpful.

      (4) Knock-out of ATP5I markedly compromised the NAD/NADH ratio (Fig.3A) and cell proliferation (Fig.3D). These effects may be associated with decreased mitochondrial membrane potential which could explain the low efficacy for metformin (and most of the data in Figs 3-5). This possibility should be discussed. Effects of [metformin] on the NAD/NADH ratio in control cells and ATP5I-KO would have been helpful because the metformin data on cell growth is normalized as fold change relative to control, whereas the NAD/NADH ratio would represent a direct absolute measurement enabling comparison of the absolute effect in control cells with ATP5I KO.

      (5) Figure-6 CRISPR/Cas9 KO at 16mM metformin in comparison with 70nM rotenone and 2 micromolar oligomycin (in serum containing medium). The rationale for use of such a high concentration of metformin has not been explained. In liver cells metformin concentrations above 1mM cause severe ATP depletion, whereas therapeutic (micromolar) concentrations have minimal effects on cellular ATP status. The 16mM concentration is ~2 orders of magnitude higher than therapeutic concentrations and likely linked to compromised energy status. The stronger inhibition of cell proliferation by 16mM metformin compared with rotenone or oligomycin raises the issue whether the changes in gene expression may be linked to the greater inhibition of mitochondrial metabolism. Validation of the cellular ATP status and NAD/NADH with metformin as compared with the two inhibitors could help the interpretation of this data.

      Comments on revisions:

      No further comments.

    4. Reviewer #3 (Public review):

      Most of the data are based on measurements of the oxygen consumption rate (OCR) and extracellular acidification rate (ECAR) measured by the Seahorse analyser in control and ATP5I KO cells. However, these measurements are conducted by a single injection of a biguanide, followed over time and presented as fold change. By doing so, the individual information on the effect of metformin and its derivatives on control and KO cells is lost. In addition, the usual measurement of OCR is coupled with certain inhibitors and uncouplers, such as oligomycin, FCCP and Antimycin A/rotenone, to understand the contribution of individual complexes to the respiration. Since biguanides and ATP5I KO affect protein levels of components of complex I and IV, it would be informative to measure their individual contributions/effects in the Seahorse. To further strengthen the data, it would be helpful to obtain measurements of actual ATP levels in these cells, as this would explain the activation of AMPK.

      The authors report on alterations in mitochondrial morphology upon ATP5I KO, which is measured by subjective quantifications of filamentous versus puncta structures. Fiji offers great tools to quantify the mitochondrial network in an unbiased way and with more accuracy using deconvolution and skeletonization of the mitochondria, providing the opportunity to measure length, shape and number quantitatively. This will help to understand better whether mitochondria are really fragmented upon ATP5I KO and rescued by its re-introduction.

      Finally, the authors report in the last part of the paper a genetic CRISPR/Cas9 KO screen in NALM-6 cells cultured with high amounts of metformin to identify potential new mediators of metformin action. It is difficult to connect that to the rest of the paper, because a) different concentrations of metformin are used and b) the metabolic effects on energy consumption are not defined. They argue about the molecular function of the obtained hits based on literature, and on a comparison of the pattern of genetic alterations based on treatments with known inhibitors such as oligomycin and rotenone. However, a direct connection is not provided, thus the interpretation at the end of the results that "the OMA1-DELE1-HRI pathway mediates the antiproliferative activity of both biguanides and the F1ATPase inhibitor oligomycin" while increasing glycolysis, needs to be toned down. This is an interesting observation, but no causality is provided. In general, this part stands alone and needs to be better connected to the rest of the paper.

      Comments on revisions:

      Thanks to the authors for addressing the concerns raised during the review of the original manuscript. The data now include proper measurements of OCR and quantifications of the mitochondrial network. The screening data are better connected to the rest of the paper and provide compelling evidence for mitochondria and in particular the ATP synthase as potential targets of metformin.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In the manuscript entitled 'The Role of ATP Synthase Subunit e (ATP5I) in Mediating the Metabolic and Antiproliferative Effects of Biguanides', Lefrancois G et al. identify ATP5I, a subunit of F1Fo-ATP synthase, as a key target of medicinal biguanides. ATP5I stabilizes F1Fo-ATP synthase dimers, essential for cristae morphology, but its role in cancer metabolism is understudied. The research shows ATP5I interacts with a biguanide analogue, and its knockout in pancreatic cancer cells mimics biguanide treatment effects, including altered mitochondria, reduced OXPHOS, and increased glycolysis. ATP5I knockout cells resist biguanide-induced antiproliferative effects, but reintroducing ATP5I restores the effects of metformin and phenformin. These findings highlight ATP5I as a promising mitochondrial target for cancer therapies. The manuscript is well written.

      Strengths:

      The experiments were carried out using systematic and well-accepted methods.

      Weaknesses:

      The significance of the target molecule and mechanisms may help in understanding the molecular mechanisms of metformin.

      We greatly appreciate the reviewer’s insightful comment regarding the importance of the target molecule and its mechanisms in elucidating metformin’s molecular actions. ATP5I plays a key role in the dimerization and assembly of the F1F0-ATP synthase complex. To address this, we performed Blue Native-PAGE followed by western blotting using an antibody against the β-subunit of the F1 domain. Our results show that metformin affects the oligomeric state of the F1F0-ATP synthase in a way that partially reproduces the effect of the KO of ATP5I (Fig 2G). This provides direct evidence that metformin acts on-target through ATP5I.

      Reviewer #2 (Public review):

      Summary:

      The mechanism(s) by which the therapeutic drug metformin lowers blood glucose in type 2 diabetes and inhibits cell proliferation at higher concentrations remain contentious. Candidate mechanisms include inhibition of Complex I of the mitochondrial respiratory chain, with consequent changes in cellular metabolites that favour allosteric activation of phosphofructokinase-1, allosteric inhibition of fructose bisphosphatase-1 and cAMP signalling, and activation of AMPK, which phosphorylates transcription factors. The current manuscript proposes the e-subunit of ATP synthase as a putative binding protein of biguanides and demonstrates that it regulates the expression of the Complex I protein NDUFB8.

      Strengths:

      (1) The metformin conjugate and metformin show comparable efficacy on inhibition of cell proliferation in the millimolar range.

      (2) Demonstration of compromised expression of the Complex I protein NDUFB8 by the ATP5I knockout and its reversal by ATP5I expression is an important strength of the study. This shows that the decreased "sensitivity" to metformin in the ATP5I knock-out cells could be due to various proteins.

      (3) Demonstration of converse effects of ATP5I KO and re-expression of ATP5I on the NAD/NADH ratio.

      Weaknesses:

      (1) The interpretation of the cellular co-localization of the biotin-biguanide conjugate with TOMM20 (Figure 1-D) as mitochondrial "accumulation" of the conjugate is overstated because it cannot exclude binding of the conjugate to the mitochondrial membrane. It would have been more convincing if additional incubations with the biotin-biguanide conjugate in combination with metformin had shown that metformin is competitive with the biotin-conjugate.

      We appreciate the reviewer’s comment and agree that the resolution provided by fluorescence microscopy makes it challenging to pinpoint the specific mitochondrial compartment where the biotin-biguanide conjugate localizes, even with additional markers such as TOMM20 antibodies for the outer mitochondrial membrane. While it remains a possibility that the conjugate binds to the mitochondrial surface, another plausible explanation is that the biotin moiety may facilitate entry into mitochondria through a biotin-specific transporter, adding further mechanistic intricacies. Furthermore, while a competition assay with metformin might help investigate interactions with mitochondrial targets and transporters (OCT family), it would not compete for biotin-mediated transport. Thus, while we acknowledge the reviewer’s suggestion, we believe such an experiment may not provide conclusive evidence regarding the conjugate’s mitochondrial localization or mechanism of entry. Instead, we revised the manuscript to more accurately describe the findings as "mitochondrial association" rather than "mitochondrial accumulation," ensuring that our interpretation remains consistent with the resolution and limitations of the data presented.

      (2) The manuscript reports the identification of 69 proteins by mass spectrometry of the pull-down assay of which 31 proteins were eluted by metformin. However, no Mass Spectrometry data is presented of the peptides identified. The methodology does not state the minimum number of peptides (1, 2?) that were used for the identification of the 31/69 proteins.

      We added a comprehensive table summarizing these findings (Figure 1- figure supplement 2). We considered all peptides and decided to perform stringent validation tests for those chosen to be further studied.

      (3) The validation of ATP5I was based on the use of recombinant protein (which was 90% pure) for the SPR and the use of a single antibody to ATP5I. The validity of the immunoblotting rests on the assumption that there is no "non-specific" immunoactivity in the relevant mol wt range. Information on the validation of the antibody would be helpful.

      Regarding the recombinant protein used for SPR, its purity was evaluated using a Coomassie-stained gel. For the antibody used in immunoblotting, its specificity was validated through knockout cell lines (Figure 2A), ensuring minimal concerns about non-specific immunoactivity within the relevant molecular weight range. Unfortunately, the KO data comes in the paper after the first immunoblots are presented. We outlined this validation in the methods section.

      (4) Knock-out of ATP5I markedly compromised the NAD/NADH ratio (Fig.3A) and cell proliferation (Figure 3D). These effects may be associated with decreased mitochondrial membrane potential which could explain the low efficacy of metformin (and most of the data in Figures 3-5). This possibility should be discussed. Effects of [metformin] on the NAD/NADH ratio in control cells and ATP5I-KO would have been helpful because the metformin data on cell growth is normalized as fold change relative to control, whereas the NAD/NADH ratio would represent a direct absolute measurement enabling comparison of the absolute effect in control cells with ATP5I KO.

      The mitochondrial membrane potential depends on a functional electron transport chain, which drives proton pumping from the matrix to the intermembrane space. Metformin can decrease the mitochondrial membrane potential, and this is usually explained as a consequence of complex I inhibition [1]. It has been published that metformin requires this membrane potential to accumulate in mitochondria, so the actions of metformin are self-limiting due to this requirement. The reviewer is right that ATP5I KO cells could be resistant to metformin because they may have a lower membrane potential. We do not believe this to be the case because the response to phenformin, another biguanide that can enter mitochondria through the membrane without the need of the OCT transporters [2], is also affected in ATP5I KO cells. Of note, compensatory mechanisms such as enhanced glycolysis, as observed in ATP5I KO cells (elevated ECAR and increased sensitivity to 2-deoxy-D-glucose), and the ATPase activity of F1F0-ATP synthase could potentially help maintain membrane potential, suggesting that this might not be an issue in the ATP5I KO cells. Chandel and colleagues already proposed that reversal of the F1F0-ATPase maintains this membrane potential in metformin-treated cells [3].

      Nevertheless, to experimentally address this point, we measured the mitochondrial membrane potential using tetramethylrhodamine methyl ester (TMRE) and ATP levels using luciferase-based assays (CellTiter-Glo) in ATP5I KO cells. We now show that ATP levels are not significantly reduced in ATP5I KO cells, likely because of compensatory glycolysis (Figure 5D), while the mitochondrial membrane potential remains close to normal (Figure 6D and E).

      We did not measure the NAD+/NADH ratio in control and KO cells treated with metformin because we now provide a more direct measurement of metformin acting on ATP5I: the state of oligomerization of the F1F0-ATPase (Figure 2G) as well as a Seahorse Bioenergetic Stress test (Figure 6A-C). Both figures provide results consistent with targeting of ATP5I by biguanides. We also discuss that targeting ATP5I can result in complex I inhibition due to the well-known role of F1F0-ATPases in cristae formation and the assembly of the respiratory complexes. We do not believe ATP5I is the only target of metformin, and in the paper we properly acknowledged and discussed other proposed targets in the introduction, the results section (page 8) and the discussion.

      (5) Figure-6 CRISPR/Cas9 KO at 16mM metformin in comparison with 70nM rotenone and 2 micromolar oligomycin (in serum-containing medium). The rationale for the use of such a high concentration of metformin has not been explained. In liver cells metformin concentrations above 1mM cause severe ATP depletion, whereas therapeutic (micromolar) concentrations have minimal effects on cellular ATP status. The 16mM concentration is ~2 orders of magnitude higher than therapeutic concentrations and likely linked to compromised energy status. The stronger inhibition of cell proliferation by 16mM metformin compared with rotenone or oligomycin raises the issue of whether the changes in gene expression may be linked to the greater inhibition of mitochondrial metabolism. Validation of the cellular ATP status and NAD/NADH with metformin as compared with the two inhibitors could help the interpretation of this data.

      NALM-6 cells are very glycolytic, have low respiration rates, and weak dependence on ATP5I (DepMap score: -0.47) [4]. The concentration of 16 mM metformin was chosen based on the IC50 for this cell line. Both ATP status and NAD+/NADH ratios will depend on the extent of the compensatory glycolysis. On the other hand, our genetic screening evaluates cell proliferation as an integration of all metabolic activities required for the process. This unbiased screening revealed a common pathway affected by metformin and oligomycin that is different from the pathway affected by rotenone, which is consistent with the finding that metformin acts on the F1F0-ATPase. Our new Seahorse data demonstrate that oligomycin has a markedly reduced effect in metformin-treated cells, supporting a shared mechanism of action. Notably, uncouplers restore respiration in both metformin-treated and ATP5I knockout cells, which aligns with the mechanism we propose (please see our new section on the Seahorse Mito Stress test and the new discussion). In the discussion, we acknowledged, based on existing literature, that the cellular context may play a significant role in determining the response to this drug.

      Reviewer #3 (Public review):

      Most of the data are based on measurements of the oxygen consumption rate (OCR) and extracellular acidification rate (ECAR) measured by the Seahorse analyser in control and ATP5I KO cells. However, these measurements are conducted by a single injection of a biguanide, followed over time and presented as fold change. By doing so, the individual information on the effect of metformin and its derivatives on control and KO cells is lost. In addition, the usual measurement of OCR is coupled with certain inhibitors and uncouplers, such as oligomycin, FCCP, and Antimycin A/rotenone, to understand the contribution of individual complexes to respiration. Since biguanides and ATP5I KO affect protein levels of components of complex I and IV, it would be informative to measure their individual contributions/effects in the Seahorse. To further strengthen the data, it would be helpful to obtain measurements of actual ATP levels in these cells, as this would explain the activation of AMPK.

      Thank you for this valuable comment. We have now performed the suggested analysis, which is presented in the new Figure 6. The data are consistent with our proposition that biguanides target ATP5I, but they also suggest the possibility of additional targets, such as Complex I, as proposed by other groups. Please see our new section on the Seahorse Mito Stress test and the new discussion. We also measured ATP (Figure 5D) and the mitochondrial membrane potential (Figure 6D and E). These measurements reflect the powerful compensation provided by glycolysis.

      The authors report on alterations in mitochondrial morphology upon ATP5I KO, which is measured by subjective quantifications of filamentous versus puncta structures. Fiji offers great tools to quantify the mitochondrial network unbiasedly and with more accuracy using deconvolution and skeletonization of the mitochondria, providing the opportunity to measure length, shape, and number quantitatively. This will help to understand better whether mitochondria are really fragmented upon ATP5I KO and rescued by its re-introduction.

      Thanks for the suggestion. We used the Mitochondria Analyzer plugin for ImageJ/Fiji to redo Figures 2 and 4, and we quantified details of the mitochondrial network, reporting differences in branch number, length, endpoints and diameter.

      Finally, the authors report in the last part of the paper a genetic CRISPR/Cas9 KO screen in NALM-6 cells cultured with high amounts of metformin to identify potential new mediators of metformin action. It is difficult to connect that to the rest of the paper because a) different concentrations of metformin are used and b) the metabolic effects on energy consumption are not defined. They argue about the molecular function of the obtained hits based on literature and on a comparison of the pattern of genetic alterations based on treatments with known inhibitors such as oligomycin and rotenone. However, a direct connection is not provided, thus the interpretation at the end of the results that "the OMA1-DELE1-HRI pathway mediates the antiproliferative activity of both biguanides and the F1ATPase inhibitor oligomycin" while increasing glycolysis, needs to be toned down. This is an interesting observation, but no causality is provided. In general, this part stands alone and needs to be better connected to the rest of the paper.

      NALM-6 cells are very glycolytic, have low respiration rates, and weak dependence on ATP5I [4], forcing us to use higher concentrations of metformin to inhibit their growth. Recent results show that metformin targets PEN2 in the cytosol to increase AMPK activity, controlling both the glucose-lowering and the life span extension abilities of metformin [5]. This work raises the question of whether the antiproliferative and anticancer effects of metformin are due to a mitochondrial activity or are controlled by this new pathway of AMPK activation. Hence, the genetic screening was performed to find, in an unbiased way, how metformin works. The results provide compelling evidence for mitochondria and in particular the ATP synthase as potential targets of metformin and a foundation for future studies. We added the following text to the beginning of this section: “Several candidate targets have been reported for biguanides and our results presented so far suggest a new one. Clues about drug mechanism of action can be obtained in unbiased manner using genetic perturbation [6]. To obtain an unbiased observation of biological processes affected by metformin, we performed a genome-wide pooled CRISPR/Cas9 KO screen in NALM-6 cells cultured in the presence of metformin at a concentration affecting growth (16 mM).”

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) In Figure 1B, the total ACC antibody is missing, and the total AMPK should be replaced, especially since they claim pAMPK increases with metformin and BFB treatment. Additionally, the streptavidin pull-down image in Figure 1F needs to be resized to show the fully cropped section.

      We repeated this experiment three times and added the new figures to the supplemental data. We corrected the main figure in the manuscript with a representative blot for total ACC (Fig 1B).

      (2) Clarify whether ATP5I alone activates mitochondrial respiratory activity or if it functions in a complex with other proteins. Also, explain how metformin affects ATP5I: is it phosphorylated directly or through an upstream target?

      ATP5I interacts directly with ATP5L, and both proteins form part of the peripheral stalk of the F1F0-ATP synthase. ATP5I and ATP5L play demonstrated roles in the dimerization of the F1F0-ATP synthase. We discussed that they may affect other functions of the enzyme as part of the peripheral stalk, which interacts with the OSCP (oligomycin sensitivity conferring protein) located in the F1 portion of the enzyme. Further work is needed to understand how ATP5I may affect the interactions between the F0 and F1 parts of the enzyme. We did not investigate whether metformin affects the phosphorylation of ATP5I, but this remains an important question for future studies. The PhosphoSitePlus database indicates that ATP5I undergoes phosphorylation and acetylation at multiple sites, suggesting potential regulatory mechanisms worth exploring.

      (3) Ensure that all immunofluorescence (IF) images include a scale bar.

      Done

      Reviewer #2 (Recommendations for the authors):

      (1) Details of the mass spectrometry analysis and the number of peptides for the proteins identified would increase the merit of the study.

      We added a comprehensive table summarizing these findings (Figure 1- figure supplement 2). We considered all peptides and decided to perform stringent validation tests for those chosen to be further studied.

      (2) The lower NAD/NADH ratios in the ATP5I KO cell lines and the higher ratios with ATP5I expression are convincing data of the cellular redox state of these cells (with variable NDUFB8). Other data sets (e.g. OCR and ECAR and Relative growth, %) are normalized to the respective control and therefore do not show the relative effect of metformin (in control cells) to the ATP5I knock-out. The effects of metformin concentration on the NAD/NADH ratio would provide a direct measure of the extent to which metformin mimics ATP5I KO. This data would be clearer to interpret than Figure 3GHKL; Figures 5EF; S1; S2).

      We did not measure the NAD+/NADH ratio in control and KO cells treated with metformin because we now provide a more direct measurement of metformin acting on ATP5I: the oligomerization state of the F1F0-ATPase and its vestigial assembly intermediates (Figure 2G) as well as a Seahorse Bioenergetic Stress test (Figure 6A-C). Both figures provide results consistent with targeting of ATP5I by biguanides. We also discuss that targeting ATP5I can result in complex I inhibition due to the well-known role of F1F0-ATPase oligomerization in cristae formation and the assembly of the respiratory complexes.

      (3) Figure 6: NAD/NADH data for metformin (16 mM) and rotenone (70 nM)/oligomycin (2 µM) would establish whether the concentrations are "matched" to allow a comparison of their gene signatures.

      We chose those concentrations because they have similar effects on cell growth, since the NAD/NADH ratio depends on the extent of glycolytic compensation induced by blocking respiration.

      (4) Intramitochondrial accumulation of the biotin conjugate could be demonstrated in Figure 1D from competition between metformin and the biotin-conjugate.

      We appreciate the reviewer’s comment and agree that the resolution provided by fluorescence microscopy makes it challenging to pinpoint the specific mitochondrial compartment where the biotin-biguanide conjugate localizes, even with additional markers such as TOMM20 antibodies for the outer mitochondrial membrane. While it remains a possibility that the conjugate binds to the mitochondrial surface, another plausible explanation is that the biotin moiety may facilitate entry into mitochondria through a biotin-specific transporter, adding further mechanistic intricacies. Furthermore, while a competition assay with metformin might help investigate interactions with mitochondrial targets and transporters (OCT family), it would not compete for biotin-mediated transport. Thus, while we acknowledge the reviewer’s suggestion, we believe such an experiment may not provide conclusive evidence regarding the conjugate’s mitochondrial localization or mechanism of entry. Instead, we revised the manuscript to more accurately describe the findings as "mitochondrial association" rather than "mitochondrial accumulation," ensuring that our interpretation remains consistent with the resolution and limitations of the data presented.

      Reviewer #3 (Recommendations for the authors):

      In addition to my comments for the public review, the manuscript would be strengthened by the following points:

      (1) The abstract needs to be streamlined to communicate more clearly what the paper is about. The last part of the results is not mentioned and is completely disconnected from the ATP5I KO story.

      We have significantly modified our abstract to include both the genetic screening significance and our new findings on the F<sub>1</sub>F<sub>0</sub>-ATP synthase oligomerization.

      (2) Quantifications of the western blots (Figure 1B) are missing. Seems like AMPK total protein levels go down with BFB.

      We quantified the blots.

      (3) How often was the pull-down repeated (Figure 1F)? It would be also important to show this in other cell types, such as pancreatic cancer cells.

      The pull-down was an initial large-scale discovery experiment performed once. However, the findings were subsequently validated in KP-4 pancreatic cancer cells in three independent experiments. As a direct readout of metformin’s impact on ATP5I, we assessed the oligomerization state of the F1ATPase and compared the effects of metformin with those of ATP5I knockout. We show that metformin partially phenocopies the ATP5I KO phenotype, and we reproduced this effect in a second cell line, U2OS osteosarcoma cells.

      (4) Does the KO of ATP5l affect other subunits of the v-ATP5a?

      Yes—we added an immunoblot to document this in Fig. 2A. Notably, ATP5I knockout also reduces ATP5L and OSCP levels.

      (5) Does metformin and BFB itself affect mitochondrial morphology and respiration?

      To evaluate the activity of BFB in comparison with metformin, we performed immunoblot analyses of the AMPK pathway, growth assays, and microscopy-based assessment of mitochondrial morphology. These data are shown in Fig. 1B–D. A more comprehensive analysis of metformin’s effects on mitochondrial respiration has now been added as Fig. 6, using Seahorse measurements and multiple respiratory inhibitors.

      (6) Since there is a strong increase in ECAR, does this correspond to an increase in glucose uptake? Are the proteins or genes involved altered or how to explain the increased flux through glycolysis in ATP5l KO cells?

      This is a very interesting idea, as our CRISPR screen identified several genes that could potentially enhance glycolysis as a vulnerability in metformin-treated cells. In future work, we will explore this biology in greater depth.

      (7) Line 242, for easier understanding, states clearly that metformin reduces growth by x-percent.

      Yes, it is a 65-fold change. We added it to the text.

      (8) The conclusion at the end of the result section is not supported by the data or not well explained. I guess oligomycin will stop the action of metformin on vATP5l, or how to explain this?

      We clarified the conclusion.

      (1) Xian, H., Liu, Y., Rundberg Nilsson, A., Gatchalian, R., Crother, T. R., Tourtellotte, W. G., Zhang, Y., Aleman-Muench, G. R., Lewis, G., Chen, W., Kang, S., Luevanos, M., Trudler, D., Lipton, S. A., Soroosh, P., Teijaro, J., de la Torre, J. C., Arditi, M., Karin, M. & Sanchez-Lopez, E. Metformin inhibition of mitochondrial ATP and DNA synthesis abrogates NLRP3 inflammasome activation and pulmonary inflammation. Immunity 54, 1463-1477 e1411, (2021).

      (2) Hawley, S. A., Ross, F. A., Chevtzoff, C., Green, K. A., Evans, A., Fogarty, S., Towler, M. C., Brown, L. J., Ogunbayo, O. A., Evans, A. M. & Hardie, D. G. Use of cells expressing gamma subunit variants to identify diverse mechanisms of AMPK activation. Cell metabolism 11, 554-565, (2010).

      (3) Wheaton, W. W., Weinberg, S. E., Hamanaka, R. B., Soberanes, S., Sullivan, L. B., Anso, E., Glasauer, A., Dufour, E., Mutlu, G. M., Budigner, G. S. & Chandel, N. S. Metformin inhibits mitochondrial complex I of cancer cells to reduce tumorigenesis. eLife 3, e02242, (2014).

      (4) Hlozkova, K., Pecinova, A., Alquezar-Artieda, N., Pajuelo-Reguera, D., Simcikova, M., Hovorkova, L., Rejlova, K., Zaliova, M., Mracek, T., Kolenova, A., Stary, J., Trka, J. & Starkova, J. Metabolic profile of leukemia cells influences treatment efficacy of L-asparaginase. BMC Cancer 20, 526, (2020).

      (5) Ma, T., Tian, X., Zhang, B., Li, M., Wang, Y., Yang, C., Wu, J., Wei, X., Qu, Q., Yu, Y., Long, S., Feng, J. W., Li, C., Zhang, C., Xie, C., Wu, Y., Xu, Z., Chen, J., Yu, Y., Huang, X., He, Y., Yao, L., Zhang, L., Zhu, M., Wang, W., Wang, Z. C., Zhang, M., Bao, Y., Jia, W., Lin, S. Y., Ye, Z., Piao, H. L., Deng, X., Zhang, C. S. & Lin, S. C. Low-dose metformin targets the lysosomal AMPK pathway through PEN2. Nature 603, 159-165, (2022).

      (6) Bruno, P. M., Liu, Y., Park, G. Y., Murai, J., Koch, C. E., Eisen, T. J., Pritchard, J. R., Pommier, Y., Lippard, S. J. & Hemann, M. T. A subset of platinum-containing chemotherapeutic agents kills cells by inducing ribosome biogenesis stress. Nat Med 23, 461-471, (2017).

    1. eLife Assessment

      This important study introduces an experimental approach for studying Drosophila oviposition rhythms and identifies the subset of circadian clock neurons that mediate the circadian control of oviposition. The authors resolve an inherently noisy rhythm to provide convincing evidence by using statistical averaging techniques, which help reduce this noise but at the cost of variation across individual rhythms. This paper will be of interest to anyone interested in insect ovarian physiology, circadian biology, and reproductive fitness.

    2. Joint Public Review:

      Summary

      Riva et al. introduce a semi-automatic setup for measuring Drosophila melanogaster oviposition rhythms and use it to map the timekeeping function underlying egg laying rhythms to a subset of clock cells. Using a combination of neurogenetic manipulations and referencing the publicly available female hemi-brain connectome dataset, they narrow the critical circuit down to two of the three CRYPTOCHROME expressing lateral-dorsal neurons (CRY[+] LNds). Their findings suggest that different overlapping sets of clock neurons may control different behavioral rhythms in D. melanogaster.

      This work will be of interest to researchers interested in the circadian regulation of oviposition in D. melanogaster (and possibly other insects), a phenomenon which has been left relatively under-explored. The construction of a semi-automated setup which can be made relatively cheaply using available motors and 3D printed molds provides a useful model for obtaining longer records of oviposition activity.

      Strengths

      The authors use a semi-automated monitoring system to detect circadian egg laying rhythms in spite of inherently noisy data. Using this approach they use a variety of different genetic tools to show that CRY+ LNds play a role in generating the circadian rhythm of oviposition, that PDF-expressing neurons seem to be important for maintaining the circadian period of egg laying, and that period locus function is required for the circadian rhythmicity of oviposition. The authors also point to some potentially interesting connectome data that suggest hypotheses regarding the neuronal circuit linking daily timekeeping to oviposition, which will require further validation in future studies.

      Weaknesses:

      The major weaknesses of this work result from the noisy nature of the data, and the need to average the individual records of many animals in order to extract significant rhythmicity values. The predicted neural output pathways will require validation in future studies.

    3. Author response:

      The following is the authors’ response to the previous reviews

      Joint Public Review:

      (1) Problems associated with averaging: The authors intended to focus on the oviposition clock in individual females, however due to the inherent noise in the oviposition rhythm they had to resort to averaging across Lomb-Scargle periodograms generated from individual time-series. They then tested whether the averaged periodogram contains a significant frequency. However, this reduction in noise also reduces the ability to compare differences in power of the rhythm across individuals. Furthermore, this method makes it especially difficult to distinguish the contribution of subsets of the circuit on the proportion of rhythmic flies and the power of the rhythm. In this revised version the authors use two manipulations to disrupt the molecular clock, which could have different success rates based on the type and number of cells targeted. Unfortunately, the type of averaging used prevents the detection of any such effects. It is to be noted that, indeed, individual-level differences in period between the PdfDicerGal4 > perRNAi and UAS-perRNAi lines help the authors to establish that there is a significant reduction in period length when the molecular clock is abolished in PDF cells. These individual measurements are now very helpful in discerning the effect of manipulations carried out on different circadian neural subsets, some of which could have been missed if only averages were considered.

      First, it is important to emphasize that we are certainly not "averaging across Lomb-Scargle periodograms". As explained in the paper (and at length in the Supplementary Material), what we do is first detrend each individual time series, then average _all_ the resulting time series (and not only those of rhythmic individuals), and finally take the Lomb-Scargle periodogram of this average series. Nevertheless, we agree with the reviewer that the use of averages reduces our ability to understand what happens at the individual level. The problem is that in most cases the presence of noise has made it difficult to draw any meaningful conclusions from individual records. One fortunate exception is the one mentioned by the reviewer. Averaging, on the other hand, has allowed us to extract some useful information in those cases.
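
      The pipeline described above (detrend each record, average all records, then take a single periodogram of the average) can be sketched as follows. This is only an illustrative sketch, not the authors' code: the linear detrending, the 2 h bins, the synthetic flies with a shared phase, and every function name are assumptions made for the example. The actual procedure is the one given in the paper's Supplementary Material.

```python
import math
import random


def detrend(series):
    """Subtract a least-squares straight line (an assumed detrending method)."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    sxx = sum((i - t_mean) ** 2 for i in range(n))
    sxy = sum((i - t_mean) * (y - y_mean) for i, y in enumerate(series))
    slope = sxy / sxx
    return [y - (y_mean + slope * (i - t_mean)) for i, y in enumerate(series)]


def population_average(records):
    """Detrend every record, then average ALL of them bin by bin."""
    detrended = [detrend(r) for r in records]
    return [sum(col) / len(detrended) for col in zip(*detrended)]


def lomb_scargle(times, values, periods):
    """Normalised Lomb-Scargle power of one series at each trial period."""
    mean = sum(values) / len(values)
    y = [v - mean for v in values]
    var = sum(v * v for v in y) / (len(y) - 1)
    powers = []
    for period in periods:
        w = 2 * math.pi / period
        # Time offset that makes the sine and cosine terms orthogonal.
        tau = math.atan2(sum(math.sin(2 * w * t) for t in times),
                         sum(math.cos(2 * w * t) for t in times)) / (2 * w)
        c = [math.cos(w * (t - tau)) for t in times]
        s = [math.sin(w * (t - tau)) for t in times]
        yc = sum(a * b for a, b in zip(y, c))
        ys = sum(a * b for a, b in zip(y, s))
        powers.append((yc ** 2 / sum(a * a for a in c)
                       + ys ** 2 / sum(a * a for a in s)) / (2 * var))
    return powers


# Synthetic data (hypothetical): 40 flies, 7 days in 2 h bins, a shared 24 h
# rhythm buried in noise, plus a slow linear decline in egg output.
random.seed(0)
times = [2.0 * i for i in range(84)]
records = [[10 - 0.02 * t + math.sin(2 * math.pi * t / 24) + random.gauss(0, 3)
            for t in times] for _ in range(40)]

average = population_average(records)
periods = [16 + 0.25 * k for k in range(65)]  # trial periods from 16 h to 32 h
powers = lomb_scargle(times, average, periods)
best_period = periods[powers.index(max(powers))]
print(f"peak of the population periodogram: {best_period:.2f} h")
```

      Because the averaging step comes before the periodogram, arrhythmic and weakly rhythmic records all contribute to the estimate, so the resulting peak describes the population rather than any single fly.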

      (2) Sensitivity to sample size: Averaging reduces the effect of random background noise but noise reduction is dependent upon sample size. Comparing genotypes with different sample sizes in addition to varying signal to noise ratios (which might also change with neural manipulations) makes it difficult to estimate how much of the rhythm structure is contributed by a given neuronal subset; thus, whenever possible comparisons should be made between groups that include similar number of flies. This problem is compounded when the averaged periodogram is composed of both rhythmic and weakly rhythmic individuals. For instance, in the main text the reported value of period length of pdfDicerGal4 > perRNAi is 20.74h (see also Fig 2J) but in the Supplementary figure 2S1 this is close to 22h, while the values reported for the control are largely similar (24.35h in Fig 2H versus ~24h in Fig 2S1). A difference of 3.6h between control and experimental flies is much greater than 2h. Which estimate (average versus individual) is more reliable in predicting the behavior of these flies is difficult to determine without further experiments.

      In most of the experiments analyzed for this paper, the numbers of flies for control and experimental genotypes are very similar. In the remaining ones, the number of flies for experimental genotypes is roughly twice the number of flies for control genotypes. As mentioned, noise reduction depends on sample size. This implies that, when a genotype is assessed as rhythmic, the sample size used is evidently large enough. On the other hand, when a genotype is assessed as arrhythmic, it is important to know whether the sample size is large enough. It is for this reason that we have used many more flies for arrhythmic genotypes than for their control genotypes.
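
      The dependence of noise reduction on sample size can be illustrated with a toy simulation. The sketch assumes independent Gaussian noise of equal size in every fly, which is a simplification and not a claim about the real data; the function name and all parameter values are hypothetical. Under that assumption, the noise left in an average of N records shrinks roughly as 1/sqrt(N).

```python
import math
import random


def residual_noise_sd(n_flies, n_bins=2000, noise_sd=1.0):
    """Standard deviation of the bin-by-bin average of n_flies pure-noise records."""
    averaged = [sum(random.gauss(0, noise_sd) for _ in range(n_flies)) / n_flies
                for _ in range(n_bins)]
    mean = sum(averaged) / n_bins
    return math.sqrt(sum((v - mean) ** 2 for v in averaged) / (n_bins - 1))


random.seed(1)
sd_16 = residual_noise_sd(16)   # noise remaining after averaging 16 flies
sd_64 = residual_noise_sd(64)   # noise remaining after averaging 64 flies
ratio = sd_16 / sd_64           # expected to be close to sqrt(64 / 16) = 2
print(f"N=16: {sd_16:.3f}  N=64: {sd_64:.3f}  ratio: {ratio:.2f}")
```

      In this idealised setting, quadrupling the number of flies roughly halves the residual noise, which is why an apparently arrhythmic genotype calls for a larger sample before the absence of a rhythm is accepted.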

      Regarding the period difference between the average of rhythmic individuals and the population denoised average, notice first that they are not necessarily exactly the same thing, since our population average uses all flies, and the denoising might introduce some variations over the underlying periods (which would be undetectable without the denoising). Also, and more importantly, Fig. 2S1 shows that the error bars for the average of the individual periods are large; thus, statistically, the reported value for the population average falls within the confidence interval for the individual average.

      (3) Based on the newly provided data for individual fly periodograms the reader can visually evaluate the rhythmicity associated with each genotype. Such visual inspection did not reveal any clear difference between the proportion of rhythmic individuals between experimental and parental GAL4 and/or UAS controls, except for experiments using per01 mutant animals. This is surprising since if these circuits are controlling the oviposition rhythm, perturbing them should affect most individuals in a similar way.

      The problem here is that, given the amount of noise present in this behavior, it is difficult to obtain any reliable information from individual records, since, by its random nature, in a given experiment noise might be disturbing the expected behavior of individuals in very different ways. That is the reason why we have resorted to population averages.

      Other comments

      Disrupting the clock in the 5th sLNv and 3 Cry+ LNds (and weakly in a small subset of DN1) affected egg-laying. Although the work emphasizes the importance of the LNd, the role of the 5th sLNv should be discussed.

      As mentioned in the paper, what the experiments show is that the 3 Cry+ LNds and 5th sLNv (usually called E cells) are candidates to be the main drivers of the oviposition rhythm, but the connectomics show that only 2 Cry+ LNds are connected to the oviposition circuit. In order to be more accurate, throughout the corresponding section (now called "The molecular clock in E neurons is necessary for rhythmic egg-laying") of the corrected manuscript we have always referred to the cells marked by the driver as E-cells. In the Discussion, we have added a line commenting that, in the connectome, the 5th sLNv is not connected to any cells of the oviposition circuit.

      Minor corrections:

      In subsection "Two Cry+ LNd neurons directly oviIN", there was a mistake in the use of "E1" and "E2" (their meanings were interchanged). We have corrected this section, giving the correct definitions. We have also corrected some minor English typos.

      Joint Recommendations for the authors:

      (1) Line 234 'to disrupt the molecular clock in (those) neurons', Please clearly describe the cell types in which MB122B driver works.

      We have clarified the cell types in which MB122B driver is expressed (line 236)

      (2) Line 235 gen cycle, should be gen'e' cycle

      The typo has been corrected

      (3) The authors should provide the raw data in repositories as per journal policy of eLife.

      The data are now available at the following links:

      https://github.com/srisaug/flywork/blob/main/RawData_Rivaetal_eLife2025_Fig4_+> UAS-perRNAi.zip

      https://github.com/srisaug/flywork/blob/main/RawData_Rivaetal_eLife2025_Fig4_M 122Bsplit-Gal4>+.zip

      https://github.com/srisaug/flywork/blob/main/RawData_Rivaetal_eLife2025_Fig4_MB122Bsplit-Gal4>UAS-perRNAi.zip

      https://github.com/srisaug/flywork/blob/main/RawData_Rivaetal_eLife2025_Figures1

    1. eLife Assessment

      This study presents valuable computational findings on the neural basis of learning new motor memories and the savings using recurrent neural networks. The evidence supporting the claims of the authors is solid, but it would benefit from more detailed discussion on the specific conditions under which savings emerges from purely implicit mechanisms. This work will be of interest to computational and experimental neuroscientists working in motor learning.

    2. Reviewer #2 (Public review):

      Summary:

      Shahbazi et al. trained recurrent neural networks (RNNs) to simulate human upper limb movement during adaptation to a force field perturbation. They demonstrated that throughout adaptation, the pattern of motor commands to the muscles of the simulated arm changed, allowing the perturbed movements to regain their typical, perturbation-free straight-line paths. After this initial learning block (FF1), the network encountered null-fields to wash out the adaptation, before re-experiencing the force in a second learning block (FF2). Upon re-exposure, the network learned faster than during initial learning, consistent with the savings observed in behavioral studies of adaptation. They also found that as the number of hidden units in the RNN increased, so did the probability of exhibiting savings. The authors concluded that these results propose a neural basis for savings that is independent of context and strategic processes.

      Strengths:

      The paper addresses an important and controversial topic in motor adaptation: the mechanism underlying motor memory. The RNN simulation reproduces behavioral hallmarks of adaptation, and it provides a useful illustration of the pattern of muscle activity underlying human-like movements under both normal and perturbing conditions. While the savings effect produced by the network, though significant, appears somewhat small, the simulation demonstrating an increase in savings with a greater number of hidden units is particularly intriguing.

      Main weakness:

      The introduction details the ongoing debate in the literature regarding the mechanisms underlying savings, particularly whether it stems from explicit or implicit learning processes. However, it remains unclear how the current work addresses this debate. There is already a considerable body of research, particularly in visuomotor adaptation, demonstrating that savings is predominantly driven by explicit strategies (e.g., Morehead et al. 2015, Haith et al., 2015; Huberdeau et al., 2019; Avraham et al., 2021). Furthermore, there have been multiple reports that implicit adaptation exhibits attenuation upon relearning (Avraham et al., 2021, Leow et al., 2020; Yin and Wei, 2020; Hamel et al., 2021; Hamel et al., 2022; Wang and Ivry, 2023; Hadjiosif et al., 2023). In the discussion, the authors acknowledge that their goal was not to model a complete explicit-implicit system, but rather to probe how savings may emerge from a purely implicit mechanism. Given the central debate introduced by the authors, the manuscript would benefit from a more detailed discussion explaining how their findings elucidate the specific conditions under which savings emerge from purely implicit mechanisms versus when cognitive strategies predominate.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Shahbazi et al used a recurrent neural network model trained to control a musculoskeletal model of the arm to investigate how neural populations accommodate activity patterns underpinning savings. The paper draws upon the recent finding of a "uniform shift" in preparatory activity in monkey motor cortex associated with savings, and leverages full access to a computational model to establish causality.

      Strengths:

      The paper is well written, and the figures are clearly presented. The key finding that the uniform shift first reported based on neural recordings by Sun et al. emerges in artificial neural networks performing a similar task is interesting and well-backed by their analyses. Manipulating this uniform shift to show that it drives behavioural savings is an important causal confirmation of the proposal by Sun et al.

      Weaknesses / Comments:

      As mentioned earlier, the core results are well backed by the analyses. Most of my comments relate to adding more controls and additional questions that could be explored with the model to strengthen the paper.

      (1) Savings are quantified as more rapid relearning of the FF upon re-exposure (e.g., Figure 3). This finding is based on backpropagation through time, but would this hold when using a different optimiser, e.g., FORCE?

      This is an interesting question, and indeed, there are an increasing number of studies addressing how different neural network learning rules may affect the kinds of representations that arise after learning (Codol et al., 2024). However, the focus of the present paper is not on which neural network approaches or which specific optimisers produce savings; rather, the focus is on the basis and neural geometry of savings when it emerges.

      We have added a short paragraph to the Discussion section [lines 349-355] to address this:

      “The present results are based on RNNs trained in an error-based approach using backpropagation through time (Werbos, 1990) using the Adam optimizer (Kingma and Ba, 2014). Other techniques for training RNNs have been proposed including the FORCE algorithm (Sussillo and Abbott, 2009). In addition, several recent reports have demonstrated success using reinforcement learning approaches to train neural networks in the context of sensorimotor control tasks (Lillicrap et al., 2015; Codol et al., 2024a). An interesting avenue for future work is to determine how the present results may or may not generalize to different neural network architectures and learning rules.”

      (2) The authors should include a "null model" showing that training on a different reaching task following NF, as opposed to FF2, won't show something akin to a uniform shift during preparation due to the adoption of TDR and having similar targets.

      This is a critical point. Training on a different reaching task other than FF2 (e.g. a different force field) will indeed result in a uniform shift, but critically, a shift in a different direction in neural state space than the uniform shift associated with FF2. The central focus of the present paper is to show that when there remains a non-zero projection of preparatory neural activity along the direction of the uniform shift associated with a given learning task, this residual projection underlies savings when networks are subsequently re-exposed to the same task.
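
      The idea of a residual projection along the uniform-shift direction can be made concrete with a toy calculation. The sketch below assumes a three-dimensional state space, eight targets, and made-up preparatory states; the function names are hypothetical and none of the numbers are taken from the simulations in the paper.

```python
import math


def mean_displacement(before, after):
    """Average, over targets, of the change in preparatory state."""
    n = len(before)
    dims = len(before[0])
    return [sum(a[d] - b[d] for a, b in zip(after, before)) / n
            for d in range(dims)]


def projection_on_shift(baseline, learned, later):
    """Project the later-vs-baseline displacement onto the unit vector of
    the uniform shift acquired during learning."""
    shift = mean_displacement(baseline, learned)
    norm = math.sqrt(sum(x * x for x in shift))
    displacement = mean_displacement(baseline, later)
    return sum(r * s / norm for r, s in zip(displacement, shift))


# Toy preparatory states: eight targets on a circle in dimensions 0 and 1.
angles = [2 * math.pi * k / 8 for k in range(8)]
baseline = [[math.cos(a), math.sin(a), 0.0] for a in angles]
# Learning the force field moves every target by the same amount along dim 2.
after_ff1 = [[x, y, z + 1.0] for x, y, z in baseline]
# Washout restores behaviour but, in this toy case, leaves 30% of the shift.
after_washout = [[x, y, z + 0.3] for x, y, z in baseline]

residual = projection_on_shift(baseline, after_ff1, after_washout)
print(f"residual projection after washout: {residual:.2f}")
```

      A non-zero value corresponds to the residual projection that the response links to savings. A shift learned for a different force field would define a different direction, and therefore a different axis for this projection.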

      In the Results section we had included a short paragraph to describe control simulations that we performed that address this concept. We have expanded this text and added a Figure and the results of statistical tests to better describe this control [lines 179-187]:

      “As an additional control we trained networks after the growing up phase on an opposing force field (CCW) and then as above, exposed the networks to a NF washout phase, and then to a CW force field. In this case no savings was observed in the CW force field, either for initial lateral deviation, or for learning rate (Figure 3). In fact, we observed that initial lateral deviation is larger for the novel force field (t(39)=-4.918, p=1.6e-5). This observation is in line with the finding that learning opposing force fields sequentially results in interference (Sun et al., 2022). The results of these control simulations underscore that the savings effect observed in our main study was learning-specific—it was due to prior learning of the CCW force field, and not a general effect of learning any novel dynamics.”

      (3) The analyses of network activity during movement preparation (Figure 4) nicely replicate the key finding in Sun et al, but I think the authors could leverage the full access to their network and go further, e.g., by examining changes (or the lack of) during execution in FF2 with respect to FF (and perhaps in a future NF2 with respect to NF), including whether execution activity also lives in parallel hyperplanes, etc.

      We agree that a visualization of the neural activity during movement would be beneficial to the reader. To address this we have added a new Figure (Fig. 6) and associated text [lines 210-219]. The Figure shows the neural trajectories when the RNNs are first exposed to FF1 and when they are first exposed to FF2 (after NF2 washout). Trajectories are plotted in 3D, corresponding to the first 3 principal components, starting at the go cue and ending 200 ms into the movement, for each of the 8 movement targets.

      “The neural trajectories for preparation and for movement can be visualized in principal component space. Figure 6 shows trajectories during planning and early execution for initial FF1 and FF2 exposure. Hidden unit activity was subjected to a principal components analysis, and neural trajectories within the first three PCs are shown for movements to each of the eight movement targets. Filled circles indicate neural state 200 ms prior to the go cue. During the preparatory period trajectories travel along PC1 and then disperse across PC2 and PC3 into the circular pattern indicated by the filled stars, which indicate time of the go cue (also see Figure 5A). After the go cue neural trajectories shift back along PC1 and rotate along oscillatory patterns characteristic of populations of motor cortical neurons in non-human primates during movement (Churchland and Shenoy, 2024).”

      (4) Related to the above, while the results are interesting and the paper is well done, I kept wishing that the authors had done "more" with their model. This could be one or two final sections on "predictions" that would nicely complement their "validation" of the uniform shift, and that, in my opinion, would greatly increase the impact of the paper. In particular:

      (a) What would be the effect of learning more "tasks"? For example, is there a limit on how many fields can be learned? (You show something related by manipulating network size, but this is slightly different.)

      These are interesting questions and to some extent they are already addressed in the paper. Of course, the number of tasks that a network is able to learn will be related to how much those tasks overlap in a control space. Indeed, this idea goes back to early theoretical accounts of connectionist models such as Hopfield nets and capacity for representing information (Hopfield, 1982; Hopfield et al., 1983). The control simulations that we described in the paper [lines 179-187 and Figure 4] are a test of one extreme version of this, in which two tasks are in direct opposition to each other (opposite force fields), and in this situation no savings emerges. We believe it is an interesting question, but beyond the scope of the present paper to undertake a comprehensive exploration of the nature of task-overlap in upper limb reaching learning tasks.

      (b) Figure 5 is a nice causal demonstration that the uniform shift is related to savings. However, and related to comment #3, it'd be interesting to see more details about how the behaviour and the network activity change as preparatory activity shifts along this axis, in particular regarding how moving the preparatory states affects the organisation and dynamics of upcoming execution activity; these are the kinds of intuitions that modelling studies like this one can provide.

      This has been addressed above by the changes we made to address the reviewer’s comment #3.

      (c) The authors focus on a task design that spans baseline, FF, NF, FF2 to replicate the original study by Sun et al. However, it would be interesting if they generated predictions for neural changes to other types of tasks that have been studied behaviourally. These could include, for example: (i) modelling a visuomotor rotation or a mirror reversal task; (ii) having to adapt to a FF in the opposite direction; (iii) investigating the role of adding an explicit context and having the networks learn multiple FF; and (iv) trying to learn FF fields in opposite directions, perhaps restricted to specific targets. As the authors know, all these questions and more have been studied with similar behavioural paradigms, and it would be nice to see what neural predictions are generated by this model.

      See responses above, e.g. to comment 4. We have clarified the text and provided a new Figure to illustrate our opposite FF control simulations. The other suggestions about visuomotor rotations and contextual cues are interesting and potentially important questions that we are working on, but we believe they are beyond the scope of the current paper, which is focused specifically on the question of savings in FF learning.

      (5) On the Discussion: When extrapolating from neural network results to animals, the fact that your networks can learn implicitly doesn't mean that animals do learn implicitly. Indeed, I think the consensus view is that different perturbations may lead to the expression of different types of savings (e.g., FF vs VR, which seems to be more explicit). Besides, these different mechanisms may be primarily implemented by brain regions less directly tied to motor control (e.g., cerebellum, parietal cortex?), which are not directly implemented in the authors' model.

      Of course the reviewer is correct that our simulations are not evidence that savings in motor tasks learned by animals is only implicit, and we do not make any such claims in the paper. The model we describe in the present paper is not meant to be a comprehensive model of motor learning in humans/animals. Indeed, the pure “context free” type of learning that we implement in our simulations basically cannot occur in animals, because there is always some information that provides contextual information. Indeed there are computational models of motor learning that include these effects, e.g. the COIN model (Heald et al., 2021). Our model however provides a useful window into what the context-free component of savings may look like. The approach we describe in the present paper is a powerful way to probe the context-free component of savings in isolation in a way that is not possible (at least not readily) in animals/humans. We have modified the text in the Discussion [lines 372-379] to better articulate this point.

      “The simulations described here do not constitute evidence that savings in motor learning tasks is exclusively implicit in animals and humans. The purely context-free learning implemented in our simulations is highly unrealistic, as some form of contextual information is invariably available. Indeed, computational models of motor learning that incorporate contextual effects already exist, e.g. (Heald et al. 2021). Nevertheless, our simulations provide a useful window into what the context-free component of savings may look like. This approach offers a powerful means of probing the context-free component of savings in isolation—something that is not readily achievable in animal or human experiments.”

      Reviewer #2 (Public review):

      Summary:

      Shahbazi et al. trained recurrent neural networks (RNNs) to simulate human upper limb movement during adaptation to a force field perturbation. They demonstrated that throughout adaptation, the pattern of motor commands to the muscles of the simulated arm changed, allowing the perturbed movements to regain their typical, perturbation-free straight-line paths. After this initial learning block (FF1), the network encountered null-fields to wash out the adaptation, before re-experiencing the force in a second learning block (FF2). Upon re-exposure, the network learned faster than during initial learning, consistent with the savings observed in behavioral studies of adaptation. They also found that as the number of hidden units in the RNN increased, so did the probability of exhibiting savings. The authors concluded that these results propose a neural basis for savings that is independent of context and strategic processes.

      Strengths:

      The paper addresses an important and controversial topic in motor adaptation: the mechanism underlying motor memory. The RNN simulation reproduces behavioral hallmarks of adaptation, and it provides a useful illustration of the pattern of muscle activity underlying human-like movements under both normal and perturbing conditions. While the savings effect produced by the network, though significant, appears somewhat small, the simulation demonstrating an increase in savings with a greater number of hidden units is particularly intriguing.

      Weaknesses:

      (1) To be transparent, savings in motor adaptation have been a primary focus of my own research. Some core findings presented in this paper are at odds with the ideas I and others have previously put forward. While I don't want to impose my agenda on the authors of this paper, I do think the authors should address these issues.

      (a) The authors acknowledge the ongoing debate in the literature regarding the mechanisms underlying savings, particularly whether it stems from explicit or implicit learning processes. However, it remains unclear how the current work addresses this debate. There is already a considerable body of research, particularly in visuomotor adaptation, demonstrating that savings is predominantly driven by explicit strategies. For example, when people are asked to report their strategy, they recall a strategy that was useful during the first learning block (Morehead et al. 2015). Furthermore, savings are abolished under experimental manipulations designed to eliminate strategic contributions (e.g., Haith et al., 2015; Huberdeau et al., 2019; Avraham et al., 2021). The authors briefly state that their findings support the hypothesis that a neural basis of memory retention underlying savings can be independent of cognitive or strategic learning components, and that savings can be characterized as implicit. While these statements may be true, it is not clear how this work substantiates these claims.

      We have addressed a similar point raised by Reviewer 1, see point #5 above. Our work represents an example of how savings can occur from implicit mechanisms in the absence of explicit contextual cues. Our goal is not to resolve the debate about how this occurs in humans/animals. Rather, our model provides a useful window into what the context-free component of savings may look like. Our approach is a powerful way to probe the context-free component of savings in isolation in a way that is not possible (at least not readily) in animals/humans. We have modified the text in the Discussion [lines 372-379] to better articulate this point.

      “The simulations described here do not constitute evidence that savings in motor learning tasks is exclusively implicit in animals and humans. The purely context-free learning implemented in our simulations is not meant to be a full model of biological learning, as in biological systems some form of contextual information is invariably available. Indeed, computational models of motor learning that incorporate contextual effects already exist, e.g. (Heald et al. 2021). Nevertheless, our simulations provide a useful window into what the context-free component of savings may look like. This approach offers a powerful means of probing the context-free component of savings in isolation—something that is not readily achievable in animal or human experiments.”

      (b) Our research has also demonstrated that if implicit adaptation is completely washed out after the initial learning block, it not only fails to exhibit savings but is actually attenuated relative to the first learning block (Avraham et al., 2021). This phenomenon of attenuation upon relearning can also be seen in other studies of visuomotor adaptation (e.g., Leow et al., 2020; Yin and Wei, 2020; Hamel et al., 2021; Hamel et al., 2022; Wang and Ivry, 2023; Hadjiosif et al., 2023). More recently, we have shown that this attenuation is due to anterograde interference arising from the experience with the washout block (Avraham and Ivry, 2025). We illustrated that the implicit system is highly susceptible to interference: it does not require exposure to salient opposite errors, and interference can occur even following prolonged exposure to veridical feedback. The central thesis of this paper, namely that implicit savings can emerge through RNNs, is at odds with these empirical results. The authors should address this discrepancy.

      These empirical results are interesting and intriguing, and we agree that they are relevant in the context of the debate about the relative contributions and interactions between explicit and implicit learning systems and savings. Importantly, contextual interference is impossible in our model, since there are no contextual cues about which force field is present or absent. Interactions between an explicit system and an implicit learning system are also impossible in our model, since there is no possibility of context-driven explicit learning or memory. The approach we have taken in the present paper is not to model a full explicit plus implicit learning system but rather to probe how savings may emerge from a purely implicit learning mechanism alone, and to compare the neural geometry underlying this implicitly driven savings to the neural recording results from monkey electrophysiology studies. Nevertheless, we have added some text to the Discussion [lines 380-391] to situate our findings in the context of the studies mentioned above by the reviewer.

      “Recent empirical work suggests that relearning after washout of implicit adaptation can be attenuated rather than facilitated, a phenomenon attributed to anterograde interference from the washout phase (Avraham et al., 2021; Hadjiosif et al., 2023; Hamel et al., 2022, 2021; Leow et al., 2020; Wang and Ivry, 2025; Yin and Wei, 2020). The savings observed in our simulations differs from these behavioral findings. Crucially, our model excludes both contextual interference (since no cues signal which force field is present) and explicit-implicit interactions (since context-driven explicit learning is absent). Our goal was not to model a complete explicit-implicit system, but rather to probe how savings may emerge from a purely implicit mechanism and to compare the underlying neural geometry to monkey electrophysiology data. Our results suggest that high-dimensional neural circuits possess an intrinsic capacity for savings via persistent preparatory traces. How and when this capacity may be masked by interference or explicit-implicit interactions in biological systems remains an open question for future work.”

      (2) This brings me to the question about neural correlates: The results are linked to activity in the primary motor cortex. How does that align with the well-established role of the cerebellum in implicit motor adaptation? And with the studies showing that savings are due to explicit strategies, which are generally associated with prefrontal regions?

      The modeling approach we use in the present paper is area agnostic, and we do not include different neural modules to represent specific brain areas such as cerebellum or prefrontal regions. In the current approach we specifically exclude explicit strategies, as a way to specifically probe implicit mechanisms alone. Also see response to reviewer 1 comment 5 above.

      (3) The analysis of the complexity of the neural network (i.e., the number of hidden units) and its relationship to savings is very interesting. It makes sense to me that more complex networks would show more savings. I'm not sure I follow the authors' explanation, but my understanding is that increased network complexity makes it more difficult to override the formed memory through interference (e.g., from the experience with NF2). Also, the results indicate that a network with 32 units led to a less-than-chance level of networks exhibiting savings (Figure 3b). What behavioral output does this configuration produce? Could this behavior manifest as attenuation upon relearning? Furthermore, if one were to examine an even smaller, simpler network (perhaps one more closely reflecting cerebellar circuits), would such a model predict attenuation rather than savings?

      These are interesting and potentially important questions for future work to explore. Our interpretation of the results for smaller networks is that these small RNNs fail to show savings presumably because the learned FF behavior is 'erased' during washout, owing to their limited capacity to retain the FF learning in a distinct neighborhood of neural state space. Our paper is focused specifically on the relationship between savings, implicit learning, and neural capacity via network size, in the context of the monkey electrophysiology results in motor cortex. It would be interesting in future work to explore a cerebellar-like modeling approach.

      (4) The authors emphasize that their network did not receive any explicit contextual signals related to the presence or absence of the force field (FF), thus operating in a 'context-free' manner. From my understanding, some existing models of context's role in motor memories (e.g., Oh and Schweighofer, 2019; Heald et al., 2021) propose that memory-related changes can be observed even without explicit contextual information, as contextual changes can be inferred from sudden or significant environmental shifts (e.g., the introduction or removal of perturbations). Given this, could the observed savings in the current simulation be explained by some form of contextual retrieval, inferred by the network from the re-presentation of the perturbation in FF2?

      It is important to note that this is not possible in the context of the modeling approach described in the present paper. For example, in trial 1 of FF2, because the network has no contextual cue signaling the FF’s presence, the network has no information before movement begins that a FF will be present during movement (recall that the FF is velocity-dependent, and so is zero before movement begins). Once the network encounters the FF during movement, some component of its response I suppose could be described as contextual inference derived from effector state (similar to the account described in the COIN model), but strictly speaking the model is only responding to what it encounters in the moment. Any change in behaviour due to prior learning (e.g. savings) is due to the interaction between the residual learning-related neural state (e.g. the uniform shift), the effector state in the moment, and the errors encountered during movement. We don’t interpret this as “inference” in the traditional sense of an explicit learning system.

      (5) If there is residual hidden unit activity related to the FF at the end of the NF2 phase, how does the simulated movement revert back to baseline? Are there any differences in the movement trajectory, beyond just lateral deviation, between NF1 and NF2? The authors state that "changes in the preparatory hidden unit activity did not result in substantive changes in the motor commands (Figure 5b), which emphasizes that the uniform shift resides in the null space of motor output." However, Figure 5b appears to show visible changes in hidden unit activity. Don't these changes reflect a pattern of muscle activity that is the basis for behavior? These changes are indeed small, but it seems that so is the effect size for savings (Figure 3a). Could this suggest that there is not, in fact, a complete washout of initial learning during NF2 within the network?

      This is precisely the point of the paper, i.e. to show that neural activity during the preparatory period before movement onset is different, even though the behaviour during the preparatory period is the same (i.e. no muscle activity and no movement). This recapitulates the empirical findings from the neural data reported in the Sun et al. (2022) paper.

      The reviewer asks “Don't these changes reflect a pattern of muscle activity that is the basis for behavior?” Yes indeed they do, but not during the NF and not during the preparatory activity prior to movement onset.

      The reviewer asks “Could this suggest that there is not, in fact, a complete washout of initial learning during NF2 within the network?” We addressed this in the paper (Results/Washout) by comparing kinematics after washout to those prior to FF learning; e.g. any differences in lateral deviation of the hand path over the entire reach trajectory were in the range of 0.1 mm, which is less than 0.25% of the lateral deviation encountered in the FF and only 0.1% of the reach distance (10 cm).

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      (1) Figure 1c, lower panel: Is this from the early or late stage of FF1?

      This is an example movement after learning in a null field (NF). We have clarified this in the Figure caption.

      (2) Please clarify what the two panels in Figure 1e represent.

      We have clarified in the Figure caption that these are activity from two example hidden units.

      (3) If Figure 2c is intended to illustrate the changes in motor commands for individual muscles, consider reorganizing the plots by muscle to more clearly show the change for each muscle from NF1 to FF1.

      The point here is not to make fine-grained comparisons between specific muscles, but rather to show a general example of how muscle activity differs. For the sake of visual simplicity in a Figure that already has many components, we have decided to keep Figure 2c the same.

      (4) The text mentions that no savings were observed when the network was trained on CCW followed by CW perturbations. However, no data or statistical analysis is presented to support this claim. I wonder if the authors would expect attenuated learning when exposed to the CW perturbation, given a memory of the opposite perturbation.

      We have added a Figure to provide data for the FF opposite control.

      (5) The relevance of the discussion on choking under pressure to the paper wasn't clear.

      We have modified the relevant text in the Discussion section [lines 356-363] to clarify the relevance of the present work to other recent work on how complex features of motor behaviour can arise due to the dynamics of preparatory neural activity in motor cortex.

      References

      Avraham G, Morehead JR, Kim HE, Ivry RB. 2021. Reexposure to a sensorimotor perturbation produces opposite effects on explicit and implicit learning processes. PLoS Biol 19:e3001147. doi:10.1371/journal.pbio.3001147

      Codol O, Krishna NH, Lajoie G, Perich MG. 2024. Brain-like neural dynamics for behavioral control develop through reinforcement learning. bioRxiv. doi:10.1101/2024.10.04.616712

      Hadjiosif AM, Morehead JR, Smith MA. 2023. A double dissociation between savings and long-term memory in motor learning. PLoS Biol 21:e3001799. doi:10.1371/journal.pbio.3001799

      Hamel R, Dallaire-Jean L, De La Fontaine É, Lepage JF, Bernier PM. 2021. Learning the same motor task twice impairs its retention in a time- and dose-dependent manner. Proc Biol Sci 288:20202556. doi:10.1098/rspb.2020.2556

      Hamel R, Lepage J-F, Bernier P-M. 2022. Anterograde interference emerges along a gradient as a function of task similarity: A behavioural study. Eur J Neurosci 55:49–66. doi:10.1111/ejn.15561

      Heald JB, Lengyel M, Wolpert DM. 2021. Contextual inference underlies the learning of sensorimotor repertoires. Nature 600:489–493. doi:10.1038/s41586-021-04129-3

      Hopfield JJ. 1982. Neural networks and physical systems with emergent collective computational abilities. Proc Natl Acad Sci U S A 79:2554–2558. doi:10.1073/pnas.79.8.2554

      Hopfield JJ, Feinstein DI, Palmer RG. 1983. “Unlearning” has a stabilizing effect in collective memories. Nature 304:158–159. doi:10.1038/304158a0

      Leow L-A, Marinovic W, de Rugy A, Carroll TJ. 2020. Task errors drive memories that improve sensorimotor adaptation. J Neurosci 40:3075–3088. doi:10.1523/JNEUROSCI.1506-19.2020

      Wang T, Ivry RB. 2025. Contextual effects during sensorimotor adaptation are an emergent property of population coding in a cerebellar-inspired model. Sci Adv 11:eadr4540. doi:10.1126/sciadv.adr4540

      Yin C, Wei K. 2020. Savings in sensorimotor adaptation without an explicit strategy. J Neurophysiol 123:1180–1192. doi:10.1152/jn.00524.2019

    1. eLife Assessment

      This study provides compelling evidence that action potential (AP) broadening is not a universal feature of homeostatic plasticity in response to chronic activity deprivation. By leveraging state-of-the-art methods across multiple brain regions and laboratories, the authors demonstrate that AP half-width remains largely stable, challenging previous assumptions in the field. These important findings help resolve longstanding inconsistencies in the literature and significantly advance our understanding of neuronal network homeostasis. The authors have clarified methodological differences with prior work and expanded the discussion of potential mechanisms, strengthening the interpretation of the findings without altering the central conclusions.

    2. Reviewer #1 (Public review):

      [Editors' note: The Reviewing Editor has assessed the revised manuscript without seeking further input from the original reviewers. The authors have addressed the main points raised during peer review, including clarifying methodological differences with prior work, providing additional analysis, and expanding the discussion of potential mechanisms. These revisions strengthen the interpretation and presentation of the findings, and the conclusions remain supported by the data.]

      Summary:

      Ritzau-Jost et al. investigate the potential contribution of AP broadening in homeostatic upregulation of neuronal network activity with a specific focus on dissociated neuronal cultures. In cultures obtained from a few brain regions from mice or rats using different culture conditions and examined by different laboratories, AP half-width remained stable despite chronic activity block with TTX. The finding suggests that AP width is not significantly modulated by changes in sodium channel activity.

      Strengths:

      The collaborative nature of the study amongst the neuronal culture experts and the rigorous electrophysiological assessments provide compelling support for the main conclusion.

    3. Reviewer #2 (Public review):

      Summary:

      This study reexamined the idea that action potential broadening serves as a homeostatic mechanism to compensate for changes in network activity. The key finding was that, while action potential broadening does occur in certain neurons, such as CA3 pyramidal cells, it is far from a universal response. This is important because it helps resolve longstanding discrepancies in the field, thereby contributing to a better understanding of network dynamics. The replication of these findings across multiple laboratories further strengthened the study's rigor.

      Strengths:

      Mechanisms of network homeostasis are essential to understand network dynamics.

    4. Reviewer #3 (Public review):

      Summary:

      The manuscript "Unreliable homeostatic action potential broadening in cultured dissociated neurons" by Ritzau-Jost et al. investigates action potential (AP) broadening as a mechanism underlying homeostatic synaptic plasticity. Given the existing variability in the literature concerning AP broadening, the authors address an important and timely research question of considerable interest to the field.

      The study systematically demonstrates cell-type- and model-specific AP broadening in hippocampal neurons after chronic treatment with either tetrodotoxin (TTX) or glutamatergic transmission blockers. The findings indicate AP broadening in CA3 pyramidal neurons in organotypic cultures after TTX treatment, but notably not in dissociated hippocampal neurons under identical conditions. However, blocking glutamatergic neurotransmission caused AP broadening in dissociated hippocampal neurons. Moreover, extensive evaluations in neocortical dissociated cultures robustly challenge previous findings by revealing a lack of AP broadening following TTX treatment. Additionally, the proposed role of BK-type potassium channels in mediating AP broadening is convincingly questioned through complementary electrophysiological and voltage-imaging experiments.

      Strengths:

      The manuscript exhibits an outstanding experimental design, employing state-of-the-art techniques and a rigorous multi-lab validation approach that greatly enhances scientific reliability. The experimental results are meticulously illustrated, and the conclusions drawn are justified and supported by the presented data. Furthermore, the manuscript is comprehensively and clearly written.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Ritzau-Jost et al. investigate the potential contribution of AP broadening in homeostatic upregulation of neuronal network activity with a specific focus on dissociated neuronal cultures. In cultures obtained from a few brain regions from mice or rats using different culture conditions and examined by different laboratories, AP half-width remained stable despite chronic activity block with TTX. The finding suggests that AP width is not significantly modulated by changes in sodium channel activity.

      Strengths:

      The collaborative nature of the study amongst the neuronal culture experts and the rigorous electrophysiological assessments provide compelling support for the main conclusion.

      Weaknesses:

      Given the negative nature of the results, a couple of remaining issues (such as the cell density of cultures and the presentation of imaging experiments with a voltage sensor) warrant further consideration. In addition, a discussion of the reasons for the seeming stability of AP half-width to sodium channel modulation might help extend the scope of the study beyond the presentation of a negative conclusion.

      We would like to thank the reviewer for positively evaluating our manuscript. Please find below our detailed point-to-point response to the reviewer’s comments.

      Reviewer #2 (Public review):

      Summary:

      This study reexamined the idea that action potential broadening serves as a homeostatic mechanism to compensate for changes in network activity. The key finding was that, while action potential broadening does occur in certain neurons, such as CA3 pyramidal cells, it is far from a universal response. This is important because it helps resolve longstanding discrepancies in the field, thereby contributing to a better understanding of network dynamics. The replication of these findings across multiple laboratories further strengthened the study's rigor.

      Strengths:

      Mechanisms of network homeostasis are essential to understand network dynamics.

      Weaknesses:

      No weaknesses were noted by this reviewer.

      We would like to thank the reviewer for the positive evaluation of our manuscript. Please find below our detailed point-to-point response to the reviewer’s comments.

      Reviewer #3 (Public review):

      Summary:

      The manuscript "Unreliable homeostatic action potential broadening in cultured dissociated neurons" by Ritzau-Jost et al. investigates action potential (AP) broadening as a mechanism underlying homeostatic synaptic plasticity. Given the existing variability in the literature concerning AP broadening, the authors address an important and timely research question of considerable interest to the field.

      The study systematically demonstrates cell-type- and model-specific AP broadening in hippocampal neurons after chronic treatment with either tetrodotoxin (TTX) or glutamatergic transmission blockers. The findings indicate AP broadening in CA3 pyramidal neurons in organotypic cultures after TTX treatment, but notably not in dissociated hippocampal neurons under identical conditions. However, blocking glutamatergic neurotransmission caused AP broadening in dissociated hippocampal neurons. Moreover, extensive evaluations in neocortical dissociated cultures robustly challenge previous findings by revealing a lack of AP broadening following TTX treatment. Additionally, the proposed role of BK-type potassium channels in mediating AP broadening is convincingly questioned through complementary electrophysiological and voltage-imaging experiments.

      Strengths:

      The manuscript exhibits an outstanding experimental design, employing state-of-the-art techniques and a rigorous multi-lab validation approach that greatly enhances scientific reliability. The experimental results are meticulously illustrated, and the conclusions drawn are justified and supported by the presented data. Furthermore, the manuscript is comprehensively and clearly written.

      Weaknesses:

      Concerning the statistical analyses employed, it is advisable to consider the Kruskal-Wallis test with corrections for multiple comparisons when evaluating more than two experimental groups.

      We would like to thank the reviewer for the positive evaluation of our manuscript. In the following, we first address the comment regarding the statistical tests used. Please also find below the detailed response to the reviewer’s further comments. Indeed, we did not apply a correction for multiple comparisons in Figure 2. This seems justified because in this exceptional case we are more concerned about type II errors (false negatives). The Kruskal-Wallis test does not seem appropriate for this type of data, for which only the comparison between the control and the respective TTX data is relevant. Instead, we followed the reviewer’s suggestion by applying corrections for false discovery rate (FDR). We thank the reviewer for pointing out this statistical issue and have addressed it in the revised manuscript (lines 121–128):

      “Even though AP durations varied up to 2-fold between conditions, statistically significant homeostatic AP broadening was not detectable in any of the tested conditions (Fig. 2B). To minimize type II errors (false negative) we intentionally did not apply a correction for multiple comparisons. The only significance was observed in condition III but in an opposite direction (i.e. AP narrowing with TTX, P=0.026; Fig. 2B). However, this is likely a false positive because application of corrections for false discovery rate results in P=0.268 for both Benjamini–Hochberg and Bonferroni correction.”
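
      The reason the two corrections give the same adjusted value for the single nominally significant comparison can be illustrated with a minimal sketch. The p-values below are hypothetical placeholders, not the values from Fig. 2B, and the function names are chosen here for illustration. For the smallest of m p-values, the Benjamini-Hochberg adjustment is p × m / 1, which coincides with the Bonferroni adjustment p × m unless a larger p-value pulls it down through the monotonicity step.

```python
def bonferroni(pvals):
    """Bonferroni-adjusted p-values: multiply each p by the number of tests, cap at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values (step-up procedure)."""
    m = len(pvals)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for ten control-vs-TTX comparisons (illustration only).
pvals = [0.026, 0.31, 0.45, 0.52, 0.60, 0.71, 0.80, 0.88, 0.93, 0.97]

bonf = bonferroni(pvals)
bh = benjamini_hochberg(pvals)
print(f"smallest p, Bonferroni-adjusted: {bonf[0]:.3f}")
print(f"smallest p, BH-adjusted:         {bh[0]:.3f}")
```

      In this hypothetical set, the lone small p-value is no longer below 0.05 after either adjustment, which is the pattern the authors describe for condition III.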

      Recommendations for the authors:

      Reviewing Editor Comments:

      The main and most important observation of the study is that the AP does not change in most cases examined. A discussion of the mechanisms of the changes in CA3 neurons would significantly strengthen the compelling evidence presented. The individual reviews are also provided, in case the authors find them useful to include other aspects suggested by the reviewers.

      We would like to thank the Reviewing Editor for handling our manuscript and for the positive evaluation of our work. The main focus of our study was the analysis of homeostatic plasticity in cultured neurons of the neocortex. We agree that the findings in CA3 neurons are interesting. As explained in more detail below, we have carefully discussed the mechanisms of the changes in CA3 neurons in the revised manuscript.

      Reviewer #1 (Recommendations for the authors):

      Major points

      (1) AP widths measured in the present study under basal conditions are generally larger than the value reported in previous work by Li et al. 2020 (~1.5 ms). In particular, rat cortical cultures prepared using the same conditions show that the mean AP half-width in controls of the present study (~2.5 ms) is closer to the mean AP half-width in TTX-treated neurons in Li et al. (~2.0 ms).

      We thank the reviewer for the detailed and positive feedback as well as for the thoughtful questions. The inconsistency between the action potential half-durations reported in our data and in Li et al.’s is partially due to differences in the way the half-duration was measured. In Li et al. the exact method is unfortunately not defined, but from a personal communication with the authors we know that they measured half-duration based on the AP amplitude between AP peak and AP voltage threshold. In contrast, we measured half-duration based on the AP amplitude between AP peak and the resting membrane potential preceding current injections. When we instead measure AP half-duration from voltage threshold, the average half-duration is 1.97 ms (compared to 2.64 ms from baseline, n = 106 cells; average across conditions I–IV, control and TTX merged). Thus, a substantial part of the discrepancy in half-duration is due to methodological differences in the way the half-duration was measured.
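
      The effect of the reference level on the measured half-duration can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the waveform is a synthetic Gaussian-shaped spike rather than a recorded AP, the resting potential (-70 mV) and threshold (-45 mV) are hypothetical values, and `half_width` is a name chosen here, not a function from either study. In practice the threshold would be detected from the trace (for example with a dV/dt criterion) rather than fixed by hand.

```python
import math


def half_width(t, v, reference):
    """Spike width at half the amplitude between `reference` and the peak.

    t, v: equal-length lists of time (ms) and voltage (mV).
    reference: voltage (mV) from which the amplitude is measured,
               e.g. resting potential or voltage threshold.
    """
    peak_idx = max(range(len(v)), key=lambda i: v[i])
    half_level = reference + 0.5 * (v[peak_idx] - reference)

    def crossing(i0, i1):
        # Linearly interpolate the time at which v crosses half_level.
        frac = (half_level - v[i0]) / (v[i1] - v[i0])
        return t[i0] + frac * (t[i1] - t[i0])

    # Walk left from the peak to the last sample still above half_level.
    i = peak_idx
    while i > 0 and v[i - 1] >= half_level:
        i -= 1
    # Walk right from the peak to the last sample still above half_level.
    j = peak_idx
    while j < len(v) - 1 and v[j + 1] >= half_level:
        j += 1
    if i == 0 or j == len(v) - 1:
        raise ValueError("trace does not cross the half-amplitude level on both sides")
    return crossing(j, j + 1) - crossing(i - 1, i)


# Synthetic spike: rest at -70 mV, peak at +30 mV, sampled every 0.01 ms.
dt = 0.01
t = [k * dt for k in range(1001)]
v = [-70.0 + 100.0 * math.exp(-((tk - 5.0) ** 2) / (2 * 0.8 ** 2)) for tk in t]

hw_from_rest = half_width(t, v, reference=-70.0)       # amplitude from resting potential
hw_from_threshold = half_width(t, v, reference=-45.0)  # amplitude from assumed threshold
print(f"half-width from rest:      {hw_from_rest:.2f} ms")
print(f"half-width from threshold: {hw_from_threshold:.2f} ms")
```

      Because the half-amplitude level sits higher on the spike when the amplitude is measured from threshold, the same waveform yields a shorter half-duration, which is the direction of the difference described above.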

      One parameter that is not stated in either study is cell plating density, which can potentially bias the neuronal network activity levels of cultures. Could the authors comment on the possible contribution of neuronal culture density to AP half-width under basal recording conditions and its sensitivity to chronic TTX treatment? Are there any data available? For example, cultures used by Li et al may have been plated at a high density and experienced high activity level during culturing, which could have contributed to the enhanced sensitivity to chronic activity suppression by TTX.

      We agree that neuronal culture density is an important factor influencing neuronal activity and hence potentially also the sensitivity to chronic activity suppression. In our experiments, the number of plated cells per cover slip varied between conditions about 3-fold: 30–50k cells for conditions I and II, 25–30k cells for conditions III, VII, XI, 50k cells for condition IV, 65k for conditions V, VI and VIII, and 70k cells for conditions IX and X. Li et al. do not provide the cell density or the number of plated cells. Despite the difference in the number of plated cells in our dataset across various laboratories, we did not observe a systematic effect of cell number on baseline AP half-duration. Furthermore, we observed strongly different baseline activity across our various experimental conditions (Fig. 3A), which did not correlate with cell density. Also, we did not notice an impact of baseline activity on the sensitivity to chronic activity suppression with TTX (cf. Fig. 3A and 2B). We have now added the number of plated cells per condition to the methods section as well as the following paragraph to the discussion section (lines 256–262):

      “The sensitivity to chronic TTX treatment might depend on baseline neuronal activity, which is in part related to neuronal culture density[37]. However, TTX did not induce AP broadening despite different baseline activities (Fig. 3A) and a nearly threefold variation in the number of plated cells per cover slip between conditions (25k – 70k cells per coverslip).”

      In addition, a discussion of the reasons for the seeming stability of AP half-width to sodium channel modulation might help extend the scope of the study beyond the presentation of a negative conclusion.

      We thank the reviewer for this suggestion and have added a paragraph to the end of the discussion emphasizing potential advantages of cell-type specific AP broadening (lines 353–362):

      “Despite the lack of homeostatic, TTX-induced AP broadening in dissociated cultures, AP duration was broadened upon Kyn-treatment in dissociated cultures and using TTX in CA3 neurons in organotypic cultures. Because BK-channels control AP duration in CA3 neurons of organotypic cultures[79], homeostatic BK-channel downregulation as proposed by Li et al. may be involved in AP broadening in this specific cell type. While the reasons for the variable occurrence of homeostatic AP broadening remain unknown, this may render neuronal circuitries more robust to perturbations. The regulation of AP duration therefore might represent one element in the repertoire of neuronal plasticity that is, similar to other plasticity mechanisms, not generally shared, but specifically expressed in some cell types and neuronal compartments.”

      (2) In this study, CA3 neurons in organotypic cultures were the only cells that showed AP broadening with TTX treatment. Notably, CA3 neurons show strong recurrent activity in general and would be expected to have experienced high levels of activity in culture. For CA3 neurons in organotypic cultures, does IbTx increase basal AP half-width?

      We thank the reviewer for this interesting idea. Even though, to our knowledge, there is no study investigating the effect of IbTx on AP width in CA3 neurons of organotypic cultures, Raffaelli et al. (DOI 10.1113/jphysiol.2004.062661) reported ~15% AP broadening using the BK-channel blocker paxilline. Therefore, TTX-induced broadening in CA3 neurons might be related to BK-channel-dependent AP repolarisation, consistent with the model proposed by Li et al. Because organotypic cultures show increased activity for longer cultivation periods and higher connectivity compared to acute slices (De Simoni et al., DOI 10.1113/jphysiol.2003.039099), the effect of TTX may be aggravated in organotypic cultures compared to acute slices or in vivo. However, the lack of a TTX-effect was not dependent on background neuronal activity or culture density in our recordings (see above as well as lines 306–310 of the revised manuscript).

      (3) Figures 4E-G. In experiments to test the efficacy of IbTx with GEVI, larger fields of view of neuron(s) used for recordings should be included. As shown, it is difficult to discern the quality of the preparation and does not provide a representative indication of the type of signals measured.

      We thank the reviewer for this suggestion and have included an image of a representative neuron expressing the GEVI in Fig. 4E.

      Minor points

      (1) Lines 222-228. With respect to cell-type specificity of TTX-induced AP broadening, the observed lack of effect of TTX in dissociated hippocampal cultures might suggest that the cultures are predominantly DG granule cells and CA1 neurons, with few CA3 neurons surviving. Could the authors comment?

      We thank the reviewer for this interesting hypothesis and have discussed it in the manuscript as a potential explanation for our different findings in the hippocampus (lines 263–270):

      “Although we mainly focus on neocortical cultured neurons (condition I to VIII, Fig. 2) because Li et al. used neocortical neurons, the absence of AP broadening in hippocampal neurons (group IX to XI) could in principle be explained by the selective loss of CA3 neurons, which show AP broadening in organotypic cultured neurons (Fig. 1A and B). However, CA3 neurons were shown to survive in dissociated cultures following region-specific microdissection[40], and CA1 neurons are generally more stress-sensitive to excitotoxicity with glutamate or NMDA than CA3 and DG neurons[42], arguing against a general selective loss of CA3 neurons in dissociated cultures.”

      (2) Figures 3D, E. To what extent is the observed increase in sEPSC amplitude due to an increase in sEPSC frequency? Is quantal amplitude increased following TTX treatment, a postsynaptic strength parameter that one would not expect to be affected by a change in AP width, but that is known to undergo up-scaling with chronic TTX treatment?

      We would like to thank the reviewer for the question. We cannot rule out an interplay between sEPSC amplitude and frequency. We did not measure quantal amplitude in the presence of TTX. Our experiments were designed to test whether TTX successfully induced homeostatic plasticity, but not to attribute the observed effect to pre- and postsynaptic mechanisms. We have added the following statement to the revised manuscript, to highlight the possible interaction of sEPSC amplitude and frequency (lines 176–178):

      “These changes in sEPSC amplitude and frequency are not specific for somatic, pre- or postsynaptic adaptations. However, the results show that blocking AP firing with TTX successfully induced homeostatic plasticity under our experimental conditions.”

      (3) Line 132. Could the authors explain the rationale for using AP amplitude as a measure of neuronal "viability"?

      In a response to Cell, Li et al. suggested that the lack of a TTX effect was due to recordings from unhealthy neurons and that small AP amplitudes could indicate impaired cell viability. Indeed, we also believe that cells which appear morphologically less healthy tend to have small and slow APs. A mechanistic rationale could be a change in resting membrane potential or changes in the expression of voltage-gated sodium and potassium channels. However, AP amplitudes were not affected following TTX treatment in any of the eleven recording conditions (Fig. 2D) or in a cross-conditional comparison (Fig. 2E). In the revised manuscript, we have now added a possible rationale (lines 134–137):

      “Because unhealthy neurons tend to have small and slow APs, possibly due to changes in resting membrane potential or expression of voltage-gated sodium and potassium channels, we first analyzed AP amplitude as a measure of neuronal viability.”

      Reviewer #3 (Recommendations for the authors):

      I propose addressing the following questions, either through additional experiments (recommended) or a deeper theoretical discussion:

      (1) Since the authors demonstrate that blocking glutamatergic neurotransmission in dissociated hippocampal neurons causes AP broadening, do similar phenomena occur in organotypic cultures and dissociated neocortical neurons?

      We thank the reviewer for the interesting question. In dissociated hippocampal cultures, we show that AP duration is maintained following treatment with TTX and NBQX, while Kyn-treatment leads to AP broadening (Figure 1C). To our knowledge, the effect of Kyn on AP duration has not been studied in neocortical dissociated cultured neurons. However, Kyn induced AP broadening in CA3 neurons of hippocampal organotypic cultures (Zbili et al., DOI 10.1073/pnas.2110601118) while CNQX did not induce such broadening in CA1 neurons (Karmarkar and Buonomano, DOI 10.1111/j.1460-9568.2006.04692.x). Both findings are in accord with our recordings from dissociated hippocampal cultures. These data, however, do not allow inference as to whether AP broadening is a cell-type specific or blocker-specific mechanism in hippocampal organotypic cultures. Because the main focus of our study is the absence of AP broadening in neocortical cultured neurons as described by Li et al., we adjusted the corresponding discussion section (lines 299–322):

      “In contrast, APs were not significantly broader following synaptic block by NBQX (Fig. 1C, D), in accord with recordings from CA1 neurons in organotypic cultures using CNQX. TTX-induced broadening may therefore be cell-type specific or due to a differential effect of the glutamate receptor blockers on NMDA receptors which are blocked by Kyn but not NBQX/CNQX or TTX and which have recently been demonstrated to be important for the induction of synaptic homeostatic plasticity[41].”

      (2) Are BK channels involved in AP broadening observed in CA3 pyramidal neurons in organotypic cultures?

      We thank the reviewer for the question. BK channels control spike duration in CA3 neurons of organotypic cultures (~15% broadening upon block by paxilline; Raffaelli et al., DOI 10.1113/jphysiol.2004.062661). Even though there is no available data on the contribution of BK channels to homeostatic spike broadening in this cell type, CA3 neurons in organotypic cultures thereby fulfil the two necessary preconditions of the model proposed by Li et al. (namely, the control of the resting AP width by BK-channels and TTX-induced AP broadening). We include this possibility in the discussion (lines 355–357):

      “Because BK-channels control AP duration in CA3 neurons of organotypic cultures[79], homeostatic BK-channel downregulation as proposed by Li et al. may be involved in AP broadening in this specific cell type.”

      (3) AP broadening consistently occurs in CA3 neurons within organotypic cultures; what molecular or cellular mechanisms underpin this phenomenon, and is there a potential contribution from glial cells?

      We thank the reviewer for this interesting question. CA3 neurons show AP broadening upon chronic inactivity across various studies, which has not been observed in CA1 or DG neurons. Recordings from CA3 neurons served as a positive example for TTX-induced AP broadening in our study, in contrast to a lack of broadening in dissociated (neocortical and hippocampal) cultured neurons. The discrepancy between the results in dissociated and organotypic cultured neurons could indeed be due to interactions with glial cells. We have added this possibility to the discussion in the revised version of the manuscript (lines 270–273):

      “Altered cell-cell interactions with glia and neurons in organotypic and dissociated neuronal cultures could instead contribute to the different findings in various hippocampal preparations.”

    1. eLife Assessment

      This valuable study demonstrates that the E3 ligase ITCH regulates several steps of the SARS-CoV-2 replication cycle by enhancing ubiquitination of viral envelope and membrane proteins. The phenotypic data are based on solid evidence showing a role for ITCH in distinct phases of viral replication and host processes. The findings lay the groundwork for future studies to decipher detailed molecular mechanisms that explain how ITCH regulates SARS-CoV-2.

    2. Reviewer #1 (Public review):

      Summary:

      The authors investigated the role of an E3 ubiquitin ligase ITCH in regulating the viral life cycle of SARS-CoV-2. The authors showed that ITCH mediates ubiquitination of the membrane (M) and envelope (E) proteins of SARS-CoV-2. Ubiquitination of E and M results in enhanced interactions between the structural proteins and redistribution of the structural proteins into autophagosomes. The authors claim that the enhanced interactions between structural proteins and trafficking of the structural proteins into autophagosomes contribute to SARS-CoV-2 replication and egress, prompting ITCH as a potential antiviral target. ITCH also alters the cellular distribution of host proteases important for spike cleavage which protect and stabilize spike with cleavage. The authors also demonstrated that SARS-CoV-2 replication is augmented by ITCH in which virus replication is significantly impaired in cells lacking ITCH expression.

      Strengths:

      The authors provided high-quality data with appropriate experimental controls to justify their claims and conclusions. The mechanistic analyses are excellent and presented in a logical manner. The investigation of the role of ubiquitination in coronavirus assembly and egress is novel as most previous studies focused on its role in mediating innate immune responses.

      Comments on revisions:

      The authors have addressed my previous concerns.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The authors investigated the role of an E3 ubiquitin ligase ITCH in regulating the viral life cycle of SARS-CoV-2. The authors showed that ITCH mediates ubiquitination of the membrane (M) and envelope (E) proteins of SARS-CoV-2. Ubiquitination of E and M results in enhanced interactions between the structural proteins and redistribution of the structural proteins into autophagosomes. The authors claim that the enhanced interactions between structural proteins and trafficking of the structural proteins into autophagosomes contribute to SARS-CoV-2 replication and egress, prompting ITCH as a potential antiviral target. ITCH also alters the cellular distribution of host proteases important for spike cleavage which protect and stabilize spike with cleavage. The authors also demonstrated that SARS-CoV-2 replication is augmented by ITCH in which virus replication is significantly impaired in cells lacking ITCH expression.

      Strengths:

      The authors provided high-quality data with appropriate experimental controls to justify their claims and conclusions. The mechanistic analyses are excellent and presented in a logical manner. The investigation of the role of ubiquitination in coronavirus assembly and egress is novel as most previous studies focused on its role in mediating innate immune responses.

      Weaknesses:

      Although the authors showed that ITCH ubiquitinates E and M proteins, the claim that such ubiquitination promotes virion assembly and egress is circumstantial. The enhanced interaction between the structural proteins and targeting of ubiquitinated structural proteins into autophagosomes does not necessarily result in increased virion production and release as suggested by the authors. There is a disconnect between the ubiquitination of structural proteins and the role of ITCH in augmenting virus replication as shown in Fig. 6A and B. In addition, the authors showed that the catalytic activity of ITCH is important for the localization and maturation of host proteases. However, the mechanism behind is unknown. Also, it is unclear how protection of spike from cleavage conferred by ITCH explains its role in promoting replication as a lack of spike cleavage would inevitably compromise entry. The major weakness of the manuscript is the lack of experimental data that explains the molecular role of ITCH in relation to its phenotype observed during SARS-CoV-2 infection.

      We sincerely thank the reviewer for the positive evaluation of the quality, rigor, and novelty of our study. We particularly appreciate the thoughtful comments regarding the mechanistic link between ITCH-mediated ubiquitination and viral assembly/egress, as well as the broader implications for SARS-CoV-2 replication.

      Our data support a model in which ITCH-mediated ubiquitination of the structural proteins M and E enhances their interactions and promotes their trafficking into autophagosomal compartments, ultimately contributing to increased virion production and release. The phenotypic outcomes observed in Fig. 6A-B (replaced by re-measured viral infectious titer and genomic copy number in the culture medium of vT2-WT and vT2-KO cells) are consistent with our earlier findings in Figs. 1-5, which demonstrate that ITCH promotes SARS-CoV-2 replication. Thus, the replication defect observed in ITCH-deficient cells aligns with the mechanistic effects of ITCH on structural protein ubiquitination and trafficking.

      We agree with the reviewer that directly linking ubiquitination of structural proteins to virion production would further strengthen the mechanistic connection. However, direct detection of ubiquitinated virions in vitro, particularly by electron microscopy (EM), remains technically challenging. Our laboratory has not yet established an EM-based platform optimized for high-resolution SARS-CoV-2 virion analysis. Furthermore, it is possible that ubiquitin chains conjugated to structural proteins are cleaved during or after virion egress, which would complicate their detection in released particles. These technical and biological considerations currently limit direct visualization of ubiquitinated virions.

      Regarding the role of ITCH in regulating the localization and maturation of host proteases, our recent studies [1, 2] have demonstrated that ITCH is involved in Golgi fragmentation, leading to altered furin distribution and impaired cathepsin L maturation. These findings provide mechanistic insight into how ITCH catalytic activity may influence host protease processing. We have incorporated this discussion into the revised manuscript (last paragraph of the Discussion section) to better contextualize our observations.

      With respect to spike cleavage, although S1/S2 processing is required for SARS-CoV-2 entry, accumulating evidence suggests that excessive intracellular cleavage may be detrimental to virion stability. For example, in Vero cells lacking TMPRSS2, virions containing cleaved S1 and S2 are less stable [3]. Additionally, the D614G substitution renders the spike protein more resistant to cleavage, reduces S1 shedding, and enhances incorporation of intact spike into virions, thereby increasing infectivity and stability [4-6]. These findings suggest that maintaining intact spike during intracellular assembly may be advantageous for the viral life cycle. In this context, ITCH-mediated modulation of host protease distribution and spike processing may help preserve spike integrity within assembling virions.

      Taken together, the ability of ITCH to (i) enhance structural protein interactions, (ii) facilitate trafficking through autophagosomal pathways, and (iii) promote incorporation of intact spike into virions provides a coherent mechanistic framework explaining how ITCH enhances virion production and release. While additional studies will be required to further dissect the precise molecular details, our data collectively support a functional link between ITCH ubiquitin ligase activity and SARS-CoV-2 assembly and egress.

      Reviewer #2 (Public review):

      Summary:

      In this manuscript Qiwang Xiang et al. investigated the role of the E3 ubiquitin ligase ITCH in the life cycle of SARS-CoV-2. They claim the following:

      (i) ITCH promotes virion assembly by interacting with E and M proteins and enhancing their K63-linked ubiquitination

      (ii) ITCH-mediated ubiquitination promotes autophagosome-dependent secretion of viral particles.

      (iii) ITCH stabilizes the viral spike protein by impairing its processing by furin and cathepsin L proteases.

      The manuscript provides an interesting exploration of ITCH's role in the SARS-CoV-2 life cycle but requires additional work to strengthen key claims and address potential confounding factors.

      Strengths:

      The experiments are sufficiently clear in documenting that ITCH activity is critical for efficient SARS-CoV-2 replication and for K63-linked ubiquitination of the M and E proteins.

      Weaknesses:

      The manuscript does not convincingly demonstrate how ITCH-mediated ubiquitination of E and M impacts virus assembly and release. Identifying the specific lysine residues in M and E targeted by ITCH, and generating mutant VLPs or recombinant viruses, would strengthen the conclusions.

      Most of the conclusions rely on ITCH overexpression data, which may have off-target effects on Golgi integrity and vesicular trafficking. For instance, Figure 4F provides evidence of altered Golgi morphology and TGN46 fragmentation, raising concerns that ITCH overexpression could indirectly mislocalize furin, affecting S1/S2 cleavage of the spike protein. In addition, inhibition of furin activity may also lead to off-target effects, given its role in processing numerous host proteins.

      Similarly, ITCH overexpression is likely to indirectly affect cathepsin-L maturation. In addition, the manuscript does not clarify how impaired cathepsin L activity would influence virus assembly or release.

      A major concern is also the lack of quantification and statistical analysis of immunofluorescence images throughout the manuscript, which undermines the reliability of these observations.

      We sincerely thank the reviewer for recognizing the importance of ITCH in SARS-CoV-2 replication and for the constructive and insightful suggestions to further strengthen the manuscript.

      Regarding the impact of ITCH-mediated ubiquitination of E and M on virus assembly and release, our data support a model in which ITCH promotes K63-linked ubiquitination of the E and M proteins, facilitating their recruitment to p62-positive autophagosomal compartments. This recruitment likely enhances the spatial proximity and interaction frequency of structural proteins within assembly sites, thereby promoting efficient virion assembly and subsequent release via autophagosome-dependent secretory pathways.

      We agree that identifying the specific lysine residues in M and E targeted by ITCH and generating mutant VLPs or recombinant viruses would provide a more direct mechanistic link. These are important and technically demanding experiments that require extensive mutagenesis and reverse genetics approaches. While beyond the scope of the current study, we fully acknowledge their value and plan to pursue these directions in future work to further refine the mechanistic understanding of ITCH-dependent ubiquitination during coronavirus assembly.

      Regarding the reliance on ITCH overexpression systems, we acknowledge the reviewer’s concern that ectopic ITCH expression may affect Golgi integrity and vesicular trafficking. Indeed, our recent studies [1, 2] demonstrate that ITCH catalytic activity disrupts Golgi structure, resulting in altered furin distribution and impaired cathepsin L maturation. These findings provide mechanistic context for the phenotypes observed in the present study and suggest that ITCH regulates host protease localization through defined cellular pathways rather than nonspecific overexpression artifacts. We have now expanded the Discussion section (last paragraph) to clarify this mechanistic framework.

      Importantly, SARS-CoV-2 infection itself significantly activates endogenous ITCH, and therefore our ectopic expression system likely mimics infection-induced ITCH activation rather than representing a purely artificial condition. In addition, key phenotypes, such as reduced viral replication and altered structural protein behavior, are consistently observed in ITCH-deficient cells, supporting the physiological relevance of ITCH activity in the viral life cycle.

      Regarding cathepsin L (CTSL) maturation, we have expanded the Discussion to clarify how impaired CTSL activity may influence viral assembly and egress. ITCH inhibits CTSL maturation, thereby reducing excessive spike cleavage into smaller fragments. Although CTSL-mediated spike processing facilitates genome release following endocytosis [7, 8], CTSL is a lysosomal protease, and lysosomes are exploited by β-coronaviruses as egress organelles [9]. Excessive lysosomal proteolysis may therefore compromise virion integrity during egress. In this context, ITCH-mediated inhibition of CTSL maturation may preserve spike stability within assembling or trafficking virions, thereby promoting the production and release of infectious particles during the replication phase.

      Regarding quantification and statistical analysis of immunofluorescence data, we appreciate this important point. In the revised manuscript, we have included expanded image panels with increased cell numbers and quantitative colocalization analyses to enhance the rigor of these observations.

      Reviewer #3 (Public review):

      Summary:

      Xiang et al. investigated the role of ubiquitin E3 ligase ITCH in SARS-CoV-2 replication. First, they described the role of ITCH on the structural proteins. Here, the ubiquitination of E and M (but not S) leads to an enhanced interaction and presumably virion assembly. In addition, E and M ubiquitination seems to be necessary for p62-guided sequestration into autophagosomes for secretion. Furthermore, ITCH regulates S proteolytic cleavage by changing furin localization and inhibiting CTSL protease maturation. In addition, SARS-CoV-2 infection upregulates ITCH phosphorylation, whereas knockout of ITCH reduces SARS-CoV-2 replication.

      Strengths:

      The proposed study is of interest to the virology community because it aims to elucidate the role of ubiquitination by ITCH in SARS-CoV-2 proteins. Understanding these mechanisms will address broadly applicable questions about coronavirus biology and enhance our knowledge of ubiquitination's diverse functions in cell biology.

      Weakness:

      The involvement of ubiquitin ligases in SARS-CoV-2 replication is not entirely new (see E3 Ubiquitin Ligase RNF5; Yuan et al., 2022; Li et al., 2023). While the data generally support the conclusions, additional work is needed to confirm the role of ITCH in SARS-CoV-2 replication in a biologically relevant context. The vast majority of data is based on transient overexpression experiments of ITCH, which ultimately leads to massive ubiquitination of several viral and host cell factors, including potentially low-affinity substrates not typically recognized under physiological conditions. In addition to that, nearly all experiments were done in cells co-overexpressing ITCH and the viral structural proteins (or cellular proteases) in HEK293T cells. Therefore, a proteomic analysis of protein ubiquitination in a) SARS-CoV-2-infected cells (ideally several cell types) and b) SARS-CoV-2-infected v2T-ITCH-KO cells would verify the ITCH-related ubiquitination of e.g., E and M and would strengthen the whole manuscript. In addition, the few key experiments using SARS-CoV-2 infected cells were performed in VeroE6 cells, which are neither human nor lung-derived. Only in one experiment were lung-derived Calu3 cells included.

      Moreover, the manuscript names ITCH as a central regulator of SARS-CoV-2 replication. If ITCH is beneficial for E and M interaction and thereby aids virion assembly, showing its effect on VLP production would be desirable. Clarifications regarding data acquisition and data analysis could strengthen the manuscript and its conclusions.

      We sincerely thank the reviewer for the thoughtful evaluation and for highlighting the importance of demonstrating physiological relevance.

      We agree that the involvement of E3 ubiquitin ligases in SARS-CoV-2 replication is not entirely unprecedented. Accordingly, we have expanded the Introduction to discuss RNF5 and other E3 ligases previously implicated in SARS-CoV-2 biology (e.g., Yuan et al., 2022; Li et al., 2023), thereby clarifying how ITCH differs mechanistically.

      Regarding the reliance on transient overexpression systems, we acknowledge the reviewer’s concern. Importantly, SARS-CoV-2 infection itself significantly induces ITCH phosphorylation and activation. Therefore, our ectopic expression system likely mimics infection-driven ITCH activation rather than representing a purely artificial condition. Moreover, key findings, including reduced viral replication and diminished E/M ubiquitination, were validated in ITCH knockout cells, supporting the physiological relevance of ITCH-dependent structural protein ubiquitination under endogenous conditions.

      We appreciate the suggestion to perform a global proteomic analysis of ubiquitinated proteins in (i) SARS-CoV-2-infected cells and (ii) SARS-CoV-2-infected ITCH-KO cells. Such analyses would indeed provide a comprehensive and unbiased assessment of ITCH-dependent ubiquitination events. While this approach is beyond the scope of the current study, we fully recognize its value and plan to pursue it in future investigations to further refine the mechanistic understanding of ITCH-mediated ubiquitination during coronavirus assembly.

      With respect to the cellular models used, Vero E6/TMPRSS2 cells are widely established for SARS-CoV-2 propagation due to their robust viral replication, rapid growth, and reduced culture-adapted mutations. Compared with Calu-3 cells, which grow more slowly and may acquire specific adaptations in certain viral genes during prolonged passage, Vero E6/TMPRSS2 cells maintain high viral stability and reproducibility, making them suitable for mechanistic studies. Nevertheless, we agree that human lung-derived systems are highly relevant, and we have included Calu-3 cell data where feasible to support translational relevance.

      Regarding the role of ITCH in virion assembly, our data in Fig. 2 demonstrate that ITCH-mediated K63-linked ubiquitination enhances the interaction between E and M proteins, supporting a functional role in virus-like particle (VLP) formation. We agree that direct visualization and quantification of VLP production by EM would further strengthen this conclusion. Such experiments require additional optimization and will be pursued in future work to provide more direct structural evidence.

      Finally, in response to the reviewer’s comments on data acquisition and analysis, we have expanded image panels, increased the number of quantified cells, and included quantitative colocalization analyses with appropriate statistical evaluation in the revised manuscript to enhance rigor and reproducibility.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) The authors should compare the infectivity of SARS-CoV-2 generated in cell lines expressing or lacking ITCH to investigate the effects of ITCH on infectivity, possibly by measuring RNA to PFU ratio and determining the S cleavage pattern in purified virions.

      We re-measured the viral infectious titer and genomic copy number in the culture medium of vT2-WT and vT2-KO cells infected at an MOI of 0.0001 for 24 h. ITCH ablation reduced the viral copy number by approximately 8-fold (Fig. 6B), while the infectious titer (TCID₅₀) decreased by at least 25-fold (Fig. 6A), indicating that loss of ITCH markedly impairs the formation of infectious viral particles. This finding is consistent with the role of ITCH in promoting Spike (S) protein cleavage.

      As suggested, to assess the S cleavage pattern in secreted virions, we precipitated proteins from the culture medium of SARS-CoV-2–infected cells with or without ITCH expression. Analysis of the precipitated S proteins revealed that the loss of ITCH markedly altered the integrity of full-length S in SARS-CoV-2 virions (Fig. S7A).

      (2) The authors should strengthen the connection between ubiquitination of structural proteins and viral egress by measuring infectious virus particles in the supernatants from cells with or without ITCH expression by plaque assay. However, this cannot be accurately achieved without performing the experiment described in point 1 as cleavage of spike and infectivity would affect the results.

      While a plaque assay was not performed, we quantified infectious viral particles in the supernatants using the TCID₅₀ assay. These analyses showed that loss of ITCH resulted in a marked reduction in infectious virion production (>25-fold; Fig. 6A). In contrast, viral genomic copy numbers, which reflect both infectious and non-infectious particles, were reduced by approximately eightfold (Fig. 6B). The disproportionate reduction in infectious titer relative to viral copy number (approximately threefold difference) is consistent with a defect in virion infectivity, most likely due to impaired S cleavage in the absence of ITCH (Fig. S7A). The reduction in viral copy numbers suggests that ITCH-dependent ubiquitination of viral structural proteins contributes to efficient viral assembly and egress.
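
      The "approximately threefold" figure follows from dividing the two reported fold reductions. The short sketch below only restates that arithmetic; the variable and function names are hypothetical, and the inputs are the rounded values quoted in the response (the titer reduction is reported as a lower bound), not raw data.

```python
# Fold reductions quoted in the response (ITCH-KO relative to WT).
titer_fold_reduction = 25.0  # infectious titer (TCID50), reported as at least 25-fold
copy_fold_reduction = 8.0    # genomic copy number, reported as approximately 8-fold


def infectivity_loss_per_genome(titer_fold, copy_fold):
    """Fold drop in infectious units per genome copy.

    If the titer falls by `titer_fold` and the genome copies by `copy_fold`,
    the infectious-unit-to-genome ratio falls by titer_fold / copy_fold.
    """
    return titer_fold / copy_fold


ratio = infectivity_loss_per_genome(titer_fold_reduction, copy_fold_reduction)
print(f"Drop in infectivity per genome copy: about {ratio:.1f}-fold")
```

      Because the titer reduction is stated as "at least 25-fold", the resulting ratio of roughly 3 should be read as a lower bound on the loss of per-particle infectivity.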

      (3) The authors should strengthen the connection between ubiquitination of structural proteins and virion assembly by EM.

      We appreciate the reviewer’s insightful comment. However, detecting ubiquitinated virions in vitro via electron microscopy (EM) remains technically challenging, and our laboratory has not yet established an EM-based system optimized for SARS-CoV-2 virion analysis. Moreover, ubiquitin chains present on virions may be cleaved during or after viral egress, further complicating their detection.

      Reviewer #2 (Recommendations for the authors):

      Supp. Figure 2: the authors should provide sequencing data for both ITCH-KO clones for consistency.

      The sequences for both ITCH-KO clones have now been included (Fig. S2C).

      Figure 2: All interaction data between structural proteins and p62 rely on ITCH overexpression. It would be helpful to include data in ITCH-KO cells as controls to validate these findings.

      As suggested, we performed E-based immunoprecipitation in wild-type (WT) and ITCH-knockout (KO) cells and found that E pulled down less p62 in the absence of ITCH, confirming that ITCH-mediated ubiquitination of E facilitates its interaction with p62 (Fig. 3C).

      Figure 3H: Verify the middle LC3B panel, as it does not match the merge panel. Please, correct any discrepancies.

      We thank the reviewer for pointing out this error. Fig. 3H (now Fig. 3J) has been corrected accordingly.

      Figure 4F: the labeling of the different panels seems incorrect.

      We have corrected the figure labeling.

      The authors should perform cell viability assays in clomipramine-treated cells. In addition, the authors should clarify whether clomipramine's antiviral effects depend on ITCH expression, given the comparable virus copy numbers in treated WT (Fig. S7B) and ITCH-KO cells (Fig. S7C).

      We thank the reviewer for this helpful comment. As shown in Author response image 1, clomipramine (Clom) treatment for 48 hours resulted in a modest reduction in cell number compared with the DMSO control, but no apparent cell death was detected under these conditions.

      Author response image 1.

      Vero-TMPRSS2 (A) or Vero-ITCH-KO (B) cells were treated with DMSO or clomipramine (Clom) for 48 h, and cell viability was assessed by calcein AM staining (n = 3).

      Reviewer #3 (Recommendations for the authors):

      Results:

      Fig.2A and 2E display controversial results with different outcomes depending on the used bait. In my opinion, in both approaches, the overexpressed ITCH should be able to ubiquitinate M and E (since they are co-expressed). However, the interaction of E and M is not affected by the overexpression of ITCH or ITCH-CS when E is used as a bait (Fig.2A). In contrast, the interaction of E and M is enhanced in the presence of overexpressed ITCH (Fig.2E), when M is used as a bait.

      We thank the reviewer for pointing this out. It should be noted that the blots display only the major (un-ubiquitinated) bands of E and M. When M was used as the bait, more E (main band, un-ubiquitinated form) was co-precipitated in the presence of ectopically expressed ITCH. In contrast, when E was used as the bait, comparable levels of M (main band, un-ubiquitinated form) were detected regardless of ITCH expression. These results suggest that ubiquitin-modified M can bind more E, whereas ubiquitin-modified E does not significantly affect its interaction with M. A more detailed explanation has been added to the revised text.

      Fig.3A+3F: The authors claim a reduced E secretion when ITCH-KO cells or shRNA-treated p62 cells are used. I believe an input loading control of the supernatant displaying an equal amount of e.g. BSA is missing.

      In response to the reviewer’s suggestion, we have now included Coomassie Brilliant Blue (CBB) staining of the culture medium (now shown in Fig. 3A and Fig. 3F).

      Fig.3B: ITCH does not interact with E (or M) alone in the displayed data. The data is comparable with data observed for the interaction with S (Supp.4A). However, the author claims that ITCH interacts with M and E but not S (page 11).

      We would like to clarify that in ECL-based Western blotting, strong signals can mask weaker ones due to contrast limitations. In this experiment, ectopic expression of ITCH produced a strong signal that obscured the endogenous ITCH band. Upon longer exposure, the endogenous ITCH signal becomes visible. Additionally, our data presented in Fig. 1 and the new data in Fig. 3C demonstrate the interaction between the relevant proteins.

      Fig 3F: A scrambled control is missing. Moreover, it would be desirable to see if overexpression of p62 would enhance E release to verify that ITCH ubiquitination and p62-positive autophagosomes are necessary for E release.

      We appreciate the reviewer’s comment. Proteins in the culture medium were precipitated using TCA, and Coomassie Brilliant Blue (CBB) staining has been included (now shown in Fig. 3F). Additionally, E release was examined in the presence of overexpressed p62, and the results showed that p62 overexpression increased the level of E detected in the medium (now shown in Fig. 3G).

      Fig.3: Overall, an experiment using, e.g. cycloheximide (protein synthesis inhibitor) and MG132 (proteasome inhibitor) would strengthen the hypothesis that E and M are not degraded in a lysosome after ITCH overexpression. In my opinion, a colocalization experiment with LAMP1 is unsuitable to draw this conclusion. Would the overexpression of a deubiquitinating enzyme diminish M, E and p62 interaction? Does ITCH/p62 only regulate the release of the overexpressed single E or M protein, or does it also affect VLP release? An experiment analyzing purified VLPs produced in ITCH- or ITCH-CS overexpressing cells would be desirable.

      We thank the reviewer for these important questions. As suggested, we performed additional CHX and MG132 experiments. As shown in Fig. 3H and Fig. S3I, degradation of both E and M proteins was blocked by MG132 treatment, indicating that they are degraded via the proteasome pathway. Notably, MG132 treatment did not rescue the ITCH-mediated decrease of E/M levels, suggesting that the ITCH-dependent reduction of E and M is not mediated through the proteasome pathway. In addition, our recent back-to-back studies [1, 2] demonstrated that ITCH overexpression inhibits lysosomal function by impairing hydrolase maturation, suggesting that ITCH-mediated ubiquitination of E or M is unlikely to promote their degradation through the lysosomal pathway. Together, these data suggest that ITCH-mediated reduction of E and M is not due to enhanced degradation but is instead associated with their secretion.

      Overexpression of deubiquitinating enzymes specifically targeting E or M (which remain to be identified) would likely reduce their interaction with p62.

      Our data (Fig. 2) indicate that ITCH-mediated ubiquitination of E and M enhances their mutual interaction, supporting a role for this process in virus-like particle (VLP) formation. p62 would then facilitate the release of VLPs by promoting the secretion of ubiquitinated E and M.

      Fig.4A: PPC site mutation indicated in yellow. There is no yellow color.

      We have revised the label to read “PPC site mutation indicated in red and green”.

      Fig.4C: Why should the overexpression of ITCH or ITCH-CS affect the S protein cleavage when the cleavage site is anyhow mutated?

      In this analysis, we aimed to verify that neither ITCH nor ITCH-CS affects the cleavage pattern of the mutated S protein. As these data are already presented in Fig. 4D (now Fig. 4C), the redundant result has been removed, and the corresponding description has been added to the revised text.

      Fig.4C: Lysates from the single expression of S wt protein (-ITCH/ +ITCH-CS; as indicated in Fig.4B) is missing for comparison to S mut protein.

      As these controls and related data are already presented in Fig. 4D (now Fig. 4C), the redundant result here has been removed.

      Fig. 4D: Lane 5 and Lane 7 are labeled similarly. ITCH+ in Lane 5 needs to be removed.

      We thank the reviewer for pointing out this error. The labeling (now Fig. 4C) has been corrected.

      Fig 4G: A theoretical MOI of 1 does not lead to an infection of all cells. Therefore, including a third marker for infection control, e.g., N protein, would be helpful. This would clarify whether the changes in furin localization are due to infection.

      We thank the reviewer for raising this point. Our goal was to examine whether SARS-CoV-2 infection affects the localization of furin (mouse antibody) relative to the Golgi marker (rabbit antibody). As suitable E, N, or M antibodies raised in goat or donkey were not available, we could not include those markers in this experiment. However, we did confirm M protein expression in parallel, and the infection efficiency was higher than 80% (Author response image 2). To further validate that the observed changes in furin localization were due to viral infection, we have now included additional images showing a larger field of view containing more cells.

      Author response image 2.
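      For context on the reviewer's remark that a theoretical MOI of 1 does not infect every cell: if infectious particles are assumed to distribute over cells as a Poisson process in a single round of infection, the expected infected fraction is 1 − e^(−MOI), which is roughly 63% at an MOI of 1. The sketch below illustrates only this textbook assumption; it does not model the authors' experiment, and observed fractions can differ, for example when infection spreads over more than one round.

```python
import math


def fraction_infected(moi: float) -> float:
    """Expected fraction of cells receiving at least one infectious particle.

    Assumes particles are distributed over cells by a Poisson process with
    mean `moi` (single round of infection); this is an idealisation.
    """
    if moi < 0:
        raise ValueError("moi must be non-negative")
    return 1.0 - math.exp(-moi)


# Print the expected infected percentage for a few illustrative MOI values.
for moi in (0.1, 1.0, 3.0):
    print(f"MOI {moi}: {fraction_infected(moi) * 100:.1f}% of cells infected")
```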

      Fig.4: Generally, the colocalization of proteases with TGN46 should be analyzed quantitatively using, for example, Manders' overlap coefficient. This would be needed to draw the conclusion stated in the manuscript.

      We appreciate the reviewer’s suggestion. We have now included the colocalization analysis in Fig. 4E and F.
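      As an illustration of the kind of metric the reviewer mentions, the overlap coefficient described by Manders and colleagues is the sum of the pixel-wise products of two channels divided by the square root of the product of their summed squares. The sketch below assumes two background-subtracted channels flattened into equal-length lists; the function name and the toy intensities are hypothetical, and the response does not specify which coefficient or software the authors actually used.

```python
import math
from typing import Sequence


def manders_overlap(channel_a: Sequence[float], channel_b: Sequence[float]) -> float:
    """Overlap coefficient: sum(a*b) / sqrt(sum(a^2) * sum(b^2)).

    Ranges from 0 (no pixel carries signal in both channels) to 1
    (the two channels are proportional at every pixel).
    """
    if len(channel_a) != len(channel_b):
        raise ValueError("channels must contain the same number of pixels")
    numerator = sum(a * b for a, b in zip(channel_a, channel_b))
    denominator = math.sqrt(sum(a * a for a in channel_a) * sum(b * b for b in channel_b))
    return numerator / denominator if denominator > 0 else 0.0


# Toy intensities (hypothetical): a protease channel and a TGN46 channel.
protease = [0.0, 10.0, 20.0, 30.0]
tgn46_same = [0.0, 10.0, 20.0, 30.0]      # identical distribution
tgn46_disjoint = [40.0, 0.0, 0.0, 0.0]    # signal only where the protease has none

print(manders_overlap(protease, tgn46_same))
print(manders_overlap(protease, tgn46_disjoint))
```

      In practice the coefficient would be computed per cell or per region of interest after thresholding, so that values can be compared statistically across conditions.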

      Fig.4/5: Overview IF pictures displaying additional cells would be desirable to clarify furin/cathepsin L localization in ITCH/ITCH-CS expressing cells. Otherwise, it looks (in my opinion) very subjective.

      In response to the reviewer’s suggestion, we have included additional images with a larger field of view encompassing more cells for Fig. 4 and 5 (presented in Fig. S5B and S5H).

      Fig.5D/G: MOI is missing in the figure legend.

      As suggested, the MOI information has been added to the figure legend.

      Fig.5D/G/6C/F: Infection control (e.g., N-protein) is missing in the Western Blots.

      We have added M protein as an infection control in the figures.

      Fig.6: Why is the overall amount of ITCH reduced during the course of infection?

      We thank the reviewer for raising this point. As shown in Fig. 6C and F, ITCH was significantly activated during viral infection, as indicated by its phosphorylation at the T222 site. This activation promotes ITCH self-ubiquitination.

      Fig.6A: Would an overexpression of ITCH enhance viral replication?

      Moderate upregulation of ITCH promotes viral replication, whereas excessive ITCH overexpression leads to cell death, which in turn partially reduces viral titers.

      Discussion:

      Is there an explanation of how ITCH changes furin localization and CTSL maturation?

      Our recent back-to-back studies [1, 2] demonstrated that ectopic ITCH expression disrupts Golgi integrity, resulting in altered furin distribution and impaired CTSL maturation. The relevant discussion has now been incorporated into the revised text (last paragraph of the Discussion section).

      It would also be helpful to discuss the role of other known ubiquitin ligases like RNF5 in the replication of SARS-CoV-2 and other CoVs. Since the pandemic began, many interactome and host-factor studies in various cell types have been published. None of these studies identified ITCH so far. Could you comment on this?

      As suggested, we have included additional known ubiquitin ligases involved in SARS-CoV-2 replication and in other viral systems (see the third paragraph of the Introduction).

      Overall, in my opinion, the figure legends need to be improved. It is often not clear if ITCH is endogenously detected or overexpressed.

      We thank the reviewer for the helpful suggestion. Additional details have been incorporated into the figure legends.

      (1) Xiang Q, Lu Y, Wang H, Chen H, Chen P, Zhao X, et al. ITCH regulates Golgi integrity and proteotoxicity in neurodegeneration. Science Advances 2025; 11:eado4330.

      (2) Xiang Q, Liu Y, Wang J. Golgi fragmentation driven by the USP11-ITCH axis triggers autolysosomal failure in neurodegeneration. Autophagy 2026.

      (3) Peacock TP, Goldhill DH, Zhou J, Baillon L, Frise R, Swann OC, et al. The furin cleavage site in the SARS-CoV-2 spike protein is required for transmission in ferrets. Nature Microbiology 2021; 6:899-909.

      (4) Zhang L, Jackson CB, Mou H, Ojha A, Peng H, Quinlan BD, et al. SARS-CoV-2 spike-protein D614G mutation increases virion spike density and infectivity. Nature Communications 2020; 11:1-9.

      (5) Plante JA, Liu Y, Liu J, Xia H, Johnson BA, Lokugamage KG, et al. Spike mutation D614G alters SARS-CoV-2 fitness. Nature 2021; 592:116-21.

      (6) Daniloski Z, Jordan TX, Ilmain JK, Guo X, Bhabha G, Sanjana NE. The Spike D614G mutation increases SARS-CoV-2 infection of multiple human cell types. eLife 2021; 10:e65365.

      (7) Jaimes JA, Millet JK, Whittaker GR. Proteolytic cleavage of the SARS-CoV-2 spike protein and the role of the novel S1/S2 site. iScience 2020; 23:101212.

      (8) Zhao M-M, Yang W-L, Yang F-Y, Zhang L, Huang W-J, Hou W, et al. Cathepsin L plays a key role in SARS-CoV-2 infection in humans and humanized mice and is a promising target for new drug development. Signal Transduction and Targeted Therapy 2021; 6:1-12.

      (9) Ghosh S, Dellibovi-Ragheb TA, Kerviel A, Pak E, Qiu Q, Fisher M, et al. β-Coronaviruses use lysosomes for egress instead of the biosynthetic secretory pathway. Cell 2020; 183:1520-35.e14.

    1. eLife Assessment

      This study presents an important finding regarding how partner preference formation and pair bonding behavior are related to the oxytocin receptor gene expression in the NAc and paraventricular nucleus of the hypothalamus in prairie voles. The evidence supporting this claim is solid but could benefit from increased sample size and more thorough behavioral phenotyping. This study will be of interest to social scientists and neuroscientists who work on pair bonding and oxytocin.

    2. Reviewer #1 (Public review):

      Summary:

      In this remarkable study, the authors use some of their recently-developed oxytocin receptor knockout voles (Oxtr1-/- KOs) to re-examine how oxytocin might influence partner preference. They show that shorter cohabitation times lead to decreased huddling time and partner preference in the KO voles, but with longer periods preference is still established, i.e., the KO animals have a slower rate of forming preference, or are less sensitive to whatever cues or experiences lead to the formation of the pair bond as measured by this assay. This helps relate the authors' recent study to the rest of the literature on oxytocin and partner preference in prairie voles. To better understand what might lead to slower partner preference, they quantified changes to the durations and frequency of huddling. In separate assays they also found that Oxtr1-/- KOs interacted more with stranger males than wild-type females. In a partner choice assay they found that wild-type males prefer wild-type females more than Oxtr1-/- KO females. They then performed bulk RNA-Seq profiling of nucleus accumbens of both wild-type and Oxtr1-/- KO males and females, either housed with animals of the same sex or paired with a wild-type of opposite sex. 13 differentially expressed genes were identified, mostly due to downregulation in wild-type females. These genes were also identified in a module lost in the Oxtr1-/- voles by correlated expression profiling. They also compared results of transcriptional profiling in female and male wild-type vs Oxtr1-/- voles (independently of bonding state), and found hundreds of differentially expressed genes in nucleus accumbens, mostly in females and often with some relation to neural development and/or autism. Some of the reduction in transcript was confirmed with in situs, as well as compared to changes in transcription in the lateral septum and paraventricular nucleus (PVN) of the hypothalamus.
Finally they find fewer oxytocin+ and AVP+ neurons in the anterior PVN.

      Strengths:

      This is an important study helping to reveal the effects of oxytocin receptor knockout on behavior and gene expression. The experiments are thorough and reveal a surprising number of genetic and anatomical differences, with some sexual dimorphism as well, and the authors have more carefully examined the behavioral changes after shorter and longer periods of partner preference formation.

      Weaknesses:

      It is surprising that, given all the genetic changes identified by the authors, the behavioral phenotypes are fairly mild. The extent of gene changes also might be under-reported given the variability in the behavior and relatively low number of animals profiled.

      Comments on revisions:

      No further recommendations. I commend the authors for finding the typos in their first version and correcting the manuscript.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      In this remarkable study, the authors use some of their recently-developed oxytocin receptor knockout voles (Oxtr1-/- KOs) to re-examine how oxytocin might influence partner preference. They show that shorter cohabitation times lead to decreased huddling time and partner preference in the KO voles, but with longer periods preference is still established, i.e., the KO animals have a slower rate of forming preference or are less sensitive to whatever cues or experiences lead to the formation of the pair bond as measured by this assay. This helps relate the authors' recent study to the rest of the literature on oxytocin and partner preference in prairie voles. To better understand what might lead to slower partner preference, they quantified changes to the durations and frequency of huddling. In separate assays, they also found that Oxtr1-/- KOs interacted more with stranger males than wild-type females. In a partner choice assay, they found that wild-type males prefer wild-type females more than Oxtr1-/- KO females. They then performed bulk RNA-Seq profiling of nucleus accumbens of both wild-type and Oxtr1-/- KO males and females, either housed with animals of the same sex or paired with a wild-type of the opposite sex. 13 differentially expressed genes were identified, mostly due to downregulation in wild-type females. These genes were also identified in a module lost in the Oxtr1-/- voles by correlated expression profiling. They also compared results of transcriptional profiling in female and male wild-type vs Oxtr1-/- voles (independently of bonding state) and found hundreds of differentially expressed genes in nucleus accumbens, mostly in females and often with some relation to neural development and/or autism. Some of the reduction in the transcript was confirmed with in-situs, as well as compared to changes in transcription in the lateral septum and paraventricular nucleus (PVN) of the hypothalamus. 
Finally, they find fewer oxytocin+ and AVP+ neurons in the anterior PVN.

      Strengths:

      This is an important study helping to reveal the effects of oxytocin receptor knockout on behavior and gene expression. The experiments are thorough and reveal a surprising number of genetic and anatomical differences, with some sexual dimorphism as well, and the authors have more carefully examined the behavioral changes after shorter and longer periods of partner preference formation.

      We thank Reviewer #1 for the positive assessment of the study’s significance and for recognizing the value of our behavioral and transcriptional analyses in refining the role of oxytocin signaling in pair bonding.

      Weaknesses:

      It is surprising that given all the genetic changes identified by the authors, the behavioral phenotypes are fairly mild. The extent of gene changes also might be underreported given the variability in the behavior and relatively low number of animals profiled.

      Pair bonding is a robust behavior composed of distinct modules that are supported by redundant and compensatory neural pathways. Our findings support a model in which Oxtr functions in parallel with other mechanisms to modulate specific components of social attachment. We have addressed this point in the Discussion. We have also updated our Results and Methods sections to more clearly reflect our cohort size, which is comparable to that of similar studies.

      Reviewer #1 (Recommendations for the authors):

      How do the wild-type males 'know' which animal is which during the three-chamber assay test of Figure 4B? Do the Oxtr1-/- KO females act in some way different from the wild types in this experiment?

      We thank the reviewer for this question. During follow-up analyses prompted by reviewer requests to characterize the behaviors underlying the apparent bias in WT male choice, we discovered a labeling error in the metadata used to analyze these assays. The error flipped the genotypes of the tethered stimulus animals at the ends of the chamber. After correcting this error and reanalyzing the data, we find that naïve WT males do not show a significant preference for naïve WT females over naïve Oxtr<sup>1-/-</sup> females. We have reconfirmed the metadata used in all assays in this study; no other datasets or conclusions are affected.

      While overall choice frequency is equivalent for males and females, our revised analyses demonstrate that Oxtr loss nonetheless alters the dynamics of social interactions in a sex-specific manner. In particular, the presence of an Oxtr<sup>1-/-</sup> male significantly alters WT females’ social behavior—enhancing prosocial engagement and reducing aggression—independent of which male is ultimately chosen. These findings support the conclusion that Oxtr function modulates early reciprocal social interactions rather than categorical choice outcomes.

      MOAT and LOAT seem like cumbersome acronyms, more so than something simpler like vole 1 vs vole 2.

      We have replaced these acronyms throughout the manuscript with simpler, descriptive terminology: winner (MOAT) and loser (LOAT).

      Only three animals per condition seemed to have been used for RNA-Seq studies in Figure 5. Given the high behavioral variability in the earlier figures, did the authors screen for animals with exemplar or similar behavior within groups? The lack of significance of other genes or across other groups might just be due to a low-powered experiment given the high behavioral and genetic variability.

      We thank the reviewer for raising the important point regarding behavioral preselection, which has been performed in some similar studies. For our study, animals were not preselected based on exemplar or matched behavioral performance prior to tissue collection, as doing so would risk introducing variation in gene expression patterns due to the experience of complex social interactions. Instead, given that our prairie vole lines are maintained on an outbred background, tissue from three animals was pooled for each RNA-seq sample to reduce inter-individual variability and to capture representative transcriptional states within each experimental group. While this approach increases robustness to individual variability, we acknowledge that it may limit sensitivity to detect low-expression, behavior-linked gene transcripts.

      On lines 426-429, the authors state that "While there was no significant difference in Oxtr transcript levels by genotype (padj = 0.753)-consistent with minimal nonsense-mediated decay despite a premature stop codon-we have previously shown that no functional protein is produced in Oxtr1-/- animals (52)." This assertion could use strengthening, even if just to explain how this was verified in their previous publication. What is the evidence for nonsense decay and a full knockout of functional receptors at the protein level?

      We agree that this point benefits from clarification. Although Oxtr transcript levels were not significantly different by genotype (padj = 0.753), consistent with minimal nonsense-mediated decay, transcript abundance alone does not reflect receptor functionality. In our prior study, we directly assessed Oxtr protein function using receptor autoradiography and found a complete absence of specific ligand binding in Oxtr<sup>1-/-</sup> animals across brain regions that show robust Oxtr binding in wild-type voles, demonstrating a full loss of functional receptor protein. We have clarified this in our manuscript.

      Reviewer #2 (Public review):

      Summary:

      This manuscript uses a recently published oxytocin receptor null prairie vole line to examine the effects of this mutation on pair bonding behavior and PVN gene expression. Results reveal that Oxtr sex specifically influences early courtship behavior and partner preference formation as well as suppressing promiscuity toward novel potential mates. PVN gene expression varies between Oxtr null and WT prairie voles.

      Strengths:

      Behavioral analyses extend beyond the typical reporting of frequency and duration. The gene expression models and analyses are well-done and convincing. The experimental designs and approaches are strong.

      We thank Reviewer #2 for highlighting the strengths of the gene expression modeling and behavioral analyses.

      Weaknesses:

      More details and background literature explaining the role of the Oxt system in pair bonding behaviors is necessary, particularly for the Introduction. The authors overstate several times that Oxtr expression is not necessary for partner preference formation, based on their previous findings. However, it does appear, particularly, in the short cohabitation that it is necessary. Thus, the nuanced answer may be that Oxt may accelerate partner preference formation. Improving the presentation of the statistics and figures will make the manuscript more reader-friendly.

      We thank the reviewer for this thoughtful feedback and agree that additional background on the oxytocin (Oxt) system’s role in pair bonding will strengthen the manuscript. We have revised the introduction to expand our discussion of prior pharmacological and comparative studies suggesting that Oxt signaling modulates multiple components of pair bonding.

      Finally, in response to the reviewer’s suggestion, we have improved the presentation of figures and statistical reporting by interlacing figures with figure legends and updating the supplementary statistics table.

      Reviewer #2 (Recommendations for the authors):

      Major concerns

      (1) The Introduction provides a "broad strokes" approach to link the oxytocin and vasopressin systems as neuromodulators of social attachment processes. This study is a follow-up to a recent publication by the senior authors' groups which reported that the Oxtr null prairie voles were able to form typical pair bonds. Now, the authors are revisiting the same question by developing a series of behavioral assays to probe distinct aspects of pair bonding behavior. However, the Introduction lacks a nuanced examination of how the oxytocin system has been shown to regulate an array of social behaviors in prairie voles and other social species.

      We thank the reviewer for this observation and agree that the original Introduction did not capture the breadth and nuance of oxytocin system involvement in social behavior. We have substantially revised the Introduction in response to the reviewer’s suggestion to include a more detailed discussion of the role played by oxytocin signaling in social behaviors displayed across multiple phyla, including during the early stages of pair bonding.

      (2) In addition, there seems to be relevant viral Oxtr KD and KO studies in prairie voles which could be referenced to reflect differences between acute pharmacological Oxtr inhibition and prolonged viral KD of Oxtr on behavioral outcomes. This could also be put into context with the authors' first paper in prairie voles and others' work with mice showing how congenital Oxtr null rodent models may result in behavioral changes that are not reflected in the pharmacological or viral manipulation research. This could help justify the approach of the current study.

      We thank the reviewer for suggesting this comparison and have included a section in the Discussion comparing pharmacological manipulations and global knockouts, as well as the discrepancies in phenotypes that arise from these methods. This expanded discussion clarifies why a congenital genetic model provides complementary insights: it allows us to identify which components of pair bonding are robust to developmental loss of Oxtr and which remain sensitive, thereby distinguishing between Oxtr-dependent behavioral modules and those supported by parallel mechanisms. Additionally, to contextualize our study in the broader field, the Introduction now covers viral manipulations of Oxtr in prairie voles during the early phase of interactions between the sexes.

      (3) On lines 129-130: The authors state, "We previously found that Oxtr is not required for the display of partner preference following 1 week of cohabitation". While this is the general conclusion of their previous publication, this seems like a rather large overgeneralization. There are many studies that have documented the functional regulation and necessity of the Oxt system for partner preference behavior in prairie voles. Therefore, it would be more appropriate to state that their previous study demonstrated that "Oxtr null prairie voles are able to develop a partner preference", but not that Oxtr is not necessary for partner preference formation. This may be a question about when the KO occurs, whether it be congenital or conditional.

      (4) This statement is repeated in Lines 350-352. However, the authors can now qualify this statement at this point in the manuscript with their new data which suggests that Oxtr null voles fail to form a partner preference after short cohabitation, but WT still form such preferences. This would suggest the qualification of this statement should be on the onset of partner preference formation as Oxtr is necessary for partner preference formation after a "short" cohabitation. Therefore, both findings are more in line with previous results which suggest that Oxt signaling accelerates partner preference formation.

      We have revised this language throughout the manuscript to state that our prior work demonstrated that Oxtr null voles are capable of forming a partner preference after extended cohabitation.

      (5) It appears Supplementary Table 1 is not scaled to the page size, so not all statistical results are clear. This limits the accuracy of my review.

      This table has been reformatted to ensure all statistical results are properly scaled to page size.

      (6) It is not always clear what statistical analyses are being performed. For example, how were the data in Figures 4G-H analyzed? What statistics were used and the output should be more readily available.

      During follow-up behavioral analyses prompted by Reviewer #1's requests to characterize the basis of the apparent WT male bias, we discovered a labeling error in the metadata associated with a subset of naïve three-chamber choice assays. In these cases, the genotypes of the tethered stimulus animals had been inadvertently flipped. After correcting this error and reanalyzing the data, we find that naïve WT males do not show a significant preference for naïve WT females over naïve Oxtr<sup>1-/-</sup> females. We have rechecked the metadata for all assays included in this study and confirmed that this was the only instance in which such an error occurred. We further analyzed the temporal dynamics of naïve choice and found that Oxtr function modulates early reciprocal social interactions but does not affect the genotype ultimately chosen.

      To improve the clarity of the statistical analyses performed, we have reformatted our presentation of figure legends and our statistics table. All statistical tests, sample sizes, and relevant parameters (including exact tests used, correction methods where applicable, and definitions of units of analysis) are explicitly stated in the figure legends and compiled in the supplementary statistical summary table, in accordance with eLife reporting guidelines.

      (7) Oxytocin plays a critical role in development as early as embryogenesis. It may be useful to frame some of the Introduction and Discussion recognizing the congenital deletion of Oxtr may affect much of development. With that in mind, it is not surprising to see changes in gene expression associated with neurodevelopmental disorders.

      We now explicitly acknowledge in both the Introduction and Discussion that congenital Oxtr deletion likely impacts neural development, which provides context for the observed enrichment of neurodevelopmental gene expression changes.

      Minor concerns

      (1) It was not clear why vasopressin was referenced in the Introduction. Specifically, the study documents that Oxtr null prairie voles have a reduction in Avp neurons in the PVN, which would suggest some aspects of Oxt signaling regulate Avp expression. However, the Introduction is not focused on how Oxt regulates the Avp system but rather on how each is a modulator of social attachment. It would improve the justification of this study to focus on Avp expression if the Introduction presented this concept.

      We thank the reviewer for pointing out the need for greater clarity around our reference to vasopressin (Avp) in the Introduction. We have simply stated in the Introduction that the potential for pair bonding is correlated with the patterns of expression of Oxtr and V1ar. The goal of this study was to find evidence of behavior and gene expression changes due to the chronic loss of Oxtr, which led to our finding that a population of Avp neurons is lost in the animals lacking Oxtr. As we did not intend to justify our study on this basis, we have clarified our discussion to include previous studies where OT manipulation affects Avp neurons.

      (2) Figures and supplemental figures need figure legends.

      We have re-arranged the figure legends for each figure (including the supplementary figures) to follow the figures for easier readability and accessibility.

      (3) Figure 1 Timeline is focused more on the male timeline with "bond formation" and "bond maintenance" reflecting the days required to form a partner preference for males. The figure should be revised to reflect similar time points for female pair bonding.

      Figures have been revised to reflect each sex's bonding timeline.

      (4) Figure 1 has a color theme with females represented by red/pink and males represented by dark/light blue. However, this is not true for Figures 1C and 1D. Please revise these color schemes.

      Color schemes have been standardized across all figures.

      (5) It is not clear what is being graphed in Figures 2 and 3. The duration graphs have many more data points than the frequency graphs. Can this be explained?

      We thank the reviewer for pointing out this lack of clarity. The difference in the number of data points reflects how these measures are defined. Duration plots are generated at the level of individual huddle events, specifically pooling all huddles whose duration falls within the top quartile for a given animal, whereas frequency plots are generated at the level of individual animals and therefore contain one data point per subject. As a result, duration graphs necessarily include more data points than frequency graphs. The figure legends and Methods section explicitly state the unit of analysis for each metric and clarify why the number of data points differs between duration and frequency plots.

      (6) What are the black bars in Figure 4H meant to represent?

      We thank the reviewer for this question. In the original submission, the black bars in Figure 4H were intended to indicate time periods showing statistically significant convergence in the chooser’s preference for the MOAT (More Of Assay Time, now winner) animal, based on the sliding preference index analysis. However, as mentioned, during revision we identified a metadata error affecting the dataset used to generate this figure. After correcting the error, the figure was fully reanalyzed and regenerated. As a result, Figure 4H now presents a different analysis and no longer includes these black bars, and the conclusions drawn from this panel have been revised accordingly. The updated figure, legend, Results text and statistics table now accurately reflect the new analysis.

    1. eLife Assessment

      This important study shows that an odorant that is typically thought of as a repellant actually activates both attractant and repellant olfactory neurons in C. elegans. Convincing evidence is provided that nematode worms can integrate signals in different sensory pathways to drive different behavioral responses to the same cue. These findings will be of interest to scientists interested in combinatorial coding in sensory systems.

    2. Reviewer #1 (Public review):

      The authors investigated the response of worms to the odorant 1-octanol (1-oct) using a combination of microfluidics-based behavioral analysis and whole-network calcium imaging. They hypothesized that 1-oct may be encoded through two simultaneous, opposing afferent pathways: a repulsive pathway driven by ASH, and an attractive pathway driven by AWC. And the ultimate chemotactic outcome is likely determined by the balance between these two pathways.

      It is not surprising that 1-octanol is encoded as attractive at low concentrations and repulsive at higher concentrations. However, the novel aspect of this study is the discovery of the combinatorial coding of 1-oct in the periphery, where it serves as both an attractant and a repellent. Furthermore, the study uses this dual encoding as a model to explore the neural basis of sensory-driven behaviors at a whole-network scale in this organism. The basic conclusions of this study are well supported by the behavioral and imaging experiments, though there are certain aspects of the manuscript that would benefit from further clarification.

      A key issue is that several previous studies have demonstrated a combinatorial and concentration-dependent coding of odorant sensing in the nematode peripheral nervous system. Specifically, ASH and AWC are the primary receptors for repellent and attractive responses, respectively. However, other neurons such as AWB, AWA, and ADL are also involved in the coding process. These neurons likely communicate with different interneurons to contribute to 1-oct-induced outputs. The authors' conclusion that loss of tax-4 reduces attractive responses and that osm-9 mutants reduce repulsive responses is not entirely convincing. TAX-4 is required for both AWC (an attractive neuron) and AWB (a repulsive neuron), and osm-9 is essential for ASH, ADL, and AWA (attraction-associated). Therefore, the observed effects on the attractive and repulsive responses could be more complex. Additionally, the interpretation of results involving the use of IAA to reduce the contribution of AWC at lower concentrations lacks clarity.

      The authors did not observe any increased correlation between motor command interneurons and sensory neurons, which is consistent with the absence of a consistent relationship between state transitions and 1-oct application. Furthermore, they did not observe significant entrainment of AIB activity with the 2.2 mM 1-oct application. This might be due to the animals being anesthetized with 1 mM tetramisole hydrochloride, which could affect neural activity and/or feedback from locomotion.

      Comments on revisions:

      The authors have addressed all my previously raised concerns.

    3. Reviewer #2 (Public review):

      Summary:

      The authors used whole-network imaging to identify sensory neurons that responded to the repellant 1-octanol. While several olfactory neurons responded to the initial onset of odor pulses, two neurons consistently responded to all the pulses, ASH and AWC. ASH typically activates in response to repellants, and AWC typically activates in response to the removal of attractants. However in this case, AWC activated in response to the removal of 1-octanol, which was unexpected because 1-octanol is a harmful repellant to the worm. The authors further investigated this phenomenon by testing different concentrations of 1-octanol in a chemotaxis assay, and found that at lower (less harmful) concentrations the odor is actually an attractant, but becomes repulsive at higher concentrations. The amplitude of the ASH response appeared to be modulated by concentration, but this was not true for AWC. The authors propose a model where the behavioral response of the worm is the result of integrating these two opposing drives, where repulsion is a result of the increased ASH activity over-riding the positive drive from AWC. The authors further tested this theory by testing mutants that ablated the AWC response (tax-4 or AWC::HisCl) or ASH response (osm-9 or ASH::HisCl). The chemo-silencing (HisCl) and tax-4 experiments were consistent with their hypothesis, while the osm-9 mutation had a limited impact on chemotaxis behavior, highlighting the potential role of osm-9-independent signaling in ASH in response to 1-octanol. While the interneuron(s) that integrate these signals to influence behavior were not identified, the authors did find that increasing concentrations of 1-octanol did increase the likelihood of AVA activity, a neuron which drives reversals (and hence, behavioral repulsion).

      Strengths:

      This was simple and elegant work that identified specific neurons of interest which generated a hypothesis, which was further tested with mutants that altered neuronal activity. The authors performed both neuronal imaging and behavioral experiments to verify their claims.

      Weaknesses:

      The authors note that other sensory neurons likely contribute to 1-octanol chemotaxis. Given the NeuroPAL data, it would have been nice to identify these other neurons as well. However, the reviewer is aware that this is tangential to the primary focus of this study.

    4. Reviewer #3 (Public review):

      Summary:

      This work describes how two chemosensory neurons in C. elegans drive opposite behaviors in response to a volatile cue. Because they have different concentration dependencies, this leads to different behavioral responses (attraction at low concentration and repulsion at high concentration). It has been known that many odorants that are attractive at low concentrations are aversive at high concentrations, and the implicated neurons (at least AWC for attraction and ASH for repulsion) have been well established. Nonetheless, by studying behavior and neural responses in a common context (odor pulses, as opposed to gradients), this provides a clear picture of how these sensory neurons may guide the dose-dependent response by separately modulating odor entry and odor exit behaviors.

      Strengths:

      (1) This work provides good evidence that worms are attracted to low concentrations and repelled by high concentrations of 1-oct. Calcium imaging also makes it clear that dose-dependence of this response is stronger for ASH than AWC.

      (2) This work presents calcium imaging and behavior with the same stimulus (sudden pulses in volatile odor concentration), while previous studies often focus on using neuronal responses to pulses to understand navigation of gentle gradients.

      Weaknesses:

      (1) As a whole it is not clear precisely how important AWC is (compared to other cells) for the attractive response (as the authors correctly acknowledge).

      (2) The evidence that AIB minus AVA contains relevant information is weak. It appears the entrainment index in Fig. 6H for AIB-AVA could easily be explained by the negative entrainment between AVA and the stimulus (along with no effect or role for AIB). This is suggested by the similar p-values and similar distribution of random EIs (stretched and mirrored) between the first and last rows of this figure.

      (3) The model in Figure 7 would be strengthened if it was demonstrated that IAA is attractive when worms are saturated in a 1/10^4 concentration. Panel 7G (and ref. 39) indicate that 10^-4 IAA activates ASH, which would suggest a different explanation for the change from attraction to repulsion in 7C.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      …other neurons such as AWB, AWA, and ADL are also involved in the coding process. These neurons likely communicate with different interneurons to contribute to 1-oct-induced outputs. The authors' conclusion that loss of tax-4 reduces attractive responses and that osm-9 mutants reduce repulsive responses is not entirely convincing. TAX-4 is required for both AWC (an attractive neuron) and AWB (a repulsive neuron), and osm-9 is essential for ASH, ADL, and AWA (attraction-associated). Therefore, the observed effects on the attractive and repulsive responses could be more complex. Additionally, the interpretation of results involving the use of IAA to reduce the contribution of AWC at lower concentrations lacks clarity. A more effective approach might involve using transgenically expressed miniSOG or histamine (HisCl1) to specifically inhibit AWC neurons.

      We agree that the sensory inputs into chemotactic behavior are likely more complex, involving other neurons besides ASH and AWC. We now explicitly discuss this possibility in the Discussion (lines 449-467).

      We have also utilized transgenically expressed HisCl1 in ASH and AWC to address this concern. Crucially, we observe that some of the effects of the broad mutations are reproduced by inactivating ASH and AWC. This finding validates our overall hypothesis that sensory-driven behavior is a balance of simultaneous afferent inputs of opposite valence AND shows that ASH and AWC are involved as expected. We are currently performing a comprehensive analysis of sensory inputs into locomotory decision making, including the neurons mentioned in the Reviewer’s comment.

      We also agree that using IAA is not a very clean way to inactivate AWC. The AWC HisCl results referenced above should alleviate this concern. However, the IAA result does put our findings into a broader context of multi-sensory integration which demonstrates the potential usefulness and selective advantages of the dual-input coding architecture that we are hypothesizing.

      Furthermore, they did not observe significant entrainment of AIB activity with the 2.2 mM 1-oct application. This might be due to the animals being anesthetized with 1 mM tetramisole hydrochloride, which could affect neural activity and/or feedback from locomotion. 

      We now mention these caveats: “It is possible that immobilization and anesthetization may be affecting AIB responses to sensory activity and/or proprioceptive feedback from locomotion. However, it is also possible that motor feedback from RIM was obscuring the sensory signal.” (line 357)

      It is unclear whether subtracting AVA activity from AIB activity provides a valid measure. Similarly, it is unclear how the behavioral data from freely moving worms compares to the whole-network calcium imaging results obtained from immobilized worms.

      Ray and Gordus 2025 (Current Biology 35:5534) recently demonstrated that AIB activity can be modeled as the additive convolution of AVA, AWC, and AIA activity, lending validity to our subtractive approach. In their study, AVA was the major contributor, but addition of AWC and AIA signals (i.e. sensory inputs) resulted in significantly greater accuracy. We have now mentioned their work in the manuscript: “To address this possibility, we subtracted AVA activity, representing the motor state, from the AIB activity (AVA closely mirrors RIM), based on the observation that AIB activity can be modeled as the sum of convolutions of motor activity and sensory activity.” (lines 360-363)

      The relationship between network activity in freely moving worms and immobilized worms has been explored by Kato et al 2015 (Cell 163:656-669); we now refer to this work in the manuscript: “These transitions are related to network state changes which drive spontaneous reversals during foraging in freely moving worms. Immobilization and anesthetization, necessary for confocal imaging, distort certain aspects of these motor command sequences compared to freely moving worms executing the motor commands and receiving proprioceptive feedback. However, the intrinsic motor programs remain intact under these conditions.” (lines 131-136)

      Reviewer #2 (Public review):

      tax-4, but not osm-9 mutants were used in chemotaxis and imaging assays. It would have been nice to have osm-9 results as well for these assays. The mutants are not specific to AWC and ASH. Cell-specific rescue of these neurons would have strengthened the proposed model.

      Osm-9 data are now included in the chemotaxis assays (Fig. 4E).

      Cell-specific HisCl data are now included for ASH and AWC (Fig. 4F, G, 5D), confirming our proposed model.

      Limited tax-4 data were included in the imaging (Fig. 6), but unfortunately, NeuroPAL imaging in tax-4 has proven to be technically difficult. NeuroPAL images in the tax-4 background appear different, perhaps because of developmental effects on gene expression due to the lack of sensory input (recall that the NeuroPAL color scheme is based on the relative expression levels of 40+ neuronal promoters). Inactivation of individual sensory neurons using HisCl1 or other transgenes may be the simpler approach.

      The Results and Discussion have been significantly rewritten to incorporate these new data.

      We are currently working on a comprehensive study of the sensory inputs into locomotory decision making in the context of chemosensation, which we expect to reveal roles of other neurons besides ASH and AWC and provide a fuller picture of the complexities of this system.

      Reviewer #3 (Public review):

      (1) It is not clear precisely how important AWC is (compared to other cells) for the attractive response, though the presence of odor-off behavior implicates it. This could be resolved by looking at additional mutants (tax-4 is broad).

      We have addressed this concern using transgenically-expressed HisCl1 which has demonstrated a clear role for AWC in overall chemotaxis and locomotory decision making upon encountering the 1-oct/buffer interface in microfluidics devices (Fig. 4F, G, 5D).

      (2) Relatedly, dose-dependent chemotaxis data (Figure 4C, D) should be provided for osm-9 animals to get a sense of the degree to which dose-dependence is explained by ASH.

      Osm-9 data now included (Fig. 4E)

      The Results and Discussion have been significantly rewritten to incorporate these new data.

      (3) Figure 4A, B should include average traces with errors, as there are several ways the responses can vary across conditions.

      Averaged traces with error bars now shown (Fig. 4A, B)

      (4) The data in Figure 6G does not appear to have error bars.

      Error bars now shown for 6G

      Also, it would help to include a more conventional demonstration of AIB responding to stimuli (e.g. averaging stimulus-aligned responses as a percent of the fluorescence value at stimulus onset to perform the desired subtraction).

      Fig. 6G top panel shows the stimulus-aligned responses of AIB with no subtraction performed. The 6 sequential stimulations are shown as a single continuous trace, consistent with the experimental protocol utilized. Averaging was performed across the 12 individuals of the sample set. However, we did not calculate the average of responses within a dataset (i.e. first plus second plus third etc.) to avoid obscuring any sensitization/desensitization that might be occurring with multiple stimuli.

      Subtracted calcium traces are harder to interpret. As it stands, the evidence that sensory signals are persisting in AIB and not being shunted by proprioceptive feedback in microfluidic devices is not strong.

      Addressing the point about proprioceptive feedback in microfluidics devices, the following sentence was added in the Results section: “Immobilization distorts certain aspects of these motor command sequences compared to freely moving worms executing the motor commands and receiving proprioceptive feedback, but the intrinsic motor programs remain intact.” (lines 131-136).

      To add context for the AIB-AVA subtraction, Ray and Gordus 2025 (Current Biology 35:5534) recently demonstrated that AIB activity can be modeled as the additive convolution of AVA, AWC, and AIA activity, lending validity to our subtractive approach. In their study, AVA was the major contributor, but addition of AWC and AIA signals (i.e. sensory inputs) resulted in significantly greater accuracy. We have now mentioned their work in the manuscript: “To address this possibility, we subtracted AVA activity, representing the motor state, from the AIB activity (AVA closely mirrors RIM), based on the observation that AIB activity can be modeled as the sum of convolutions of motor activity and sensory activity.” (lines 360-363)

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Figure 1: The number of replicates (n) is missing.

      In Fig. 1D, only a single trial is shown as a representative example rather than averages, which would necessitate error bars. The Results and Figure Legend text has been updated to clarify this, and the average CI is now included in the first Results section (lines 111, 976)

      Figure 4: The sample size (n = 3-5) is relatively small, which may limit the statistical power.

      Sample size was increased to 5 for all data points shown on the new graph (Fig. 4E), as noted in the figure legend (line 1019).

      Figure 4: The 0.22 mM concentration significantly affects both AWC and ASH. It is also unclear whether this concentration also affects other neurons, such as AWB, ADL, and AWA.

      We have not performed an exhaustive analysis of other neurons in these datasets. These analyses are difficult and time-consuming, so we have opted to present a dataset which supports our hypothesis that multiple afferent pathways of opposite valence act in a balanced way to drive chemotaxis. We are currently performing an in-depth analysis of the sensory inputs into the circuit, which we expect to present in a future study.

      Reviewer #2 (Recommendations for the authors):

      The tax-4 and osm-9 experiments are great, but I recommend clarifying that tax-4 and osm-9 are expressed in other neurons as well. The text gives the impression that these mutants are specific to AWC and ASH, respectively. The authors should note these caveats.

      This concern is thoroughly addressed in the descriptions and rationale presented for the use of ASH and AWC HisCl strains.

      The authors should also provide the code used to interpret their results.

      Code will be provided through Zenodo.org

      Reviewer #3 (Recommendations for the authors):

      It would help to clarify (early on) the degree to which you are attributing responses to particular cells (e.g. AWC) as opposed to a class of cells with AWC as an example.

      This concern is thoroughly addressed in the descriptions and rationale presented for the use of ASH and AWC HisCl strains.

      The NeuroPAL imaging and analysis (especially Figures 3D, E) is a bit distracting and appears non-essential. If possible, it would help to combine Figures 2 and 3 with a focus on panels 3ABC to streamline the narrative.

      We would prefer to keep the present format so the reader can appreciate the power of the whole-brain approach for analyzing network activity and behavioral outputs in the context of sensory-motor responses. Specifically, our insight that attractive and aversive afferent inputs were activated simultaneously was wholly dependent on this approach. Otherwise, there would have been little to no reason for examining AWC activity at aversive 1-oct concentrations, which was essentially the foundation of the study.

      To highlight this point, we have added the following sentence in the Discussion: “This novel insight highlights the value of the whole-brain approach (enabled by the NeuroPAL system) for studying the network dynamics underlying sensory driven behaviors.” Lines 431-433.

    1. eLife Assessment

      The findings in this paper provide solid support for a hypothesis that has valuable implications at the intersection of value-based and social decision-making. The findings suggest that the brain processes rewards received for effort differently when they are earned for themselves versus someone else.

    2. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      The Authors test the hypothesis, using an effort-exertion and an effort-based decision-making task while recording brain dynamics with EEG, that the brain processes reward outcomes for effort differentially when they are earned for themselves versus others.

      Strengths:

      The strengths of this experiment include what appears to be a novel finding of opposite signed effects of effort on the processing of reward outcomes when the recipient is self versus others. Also, the experiment is well-designed, the study seems sufficiently powered, and the data and code are publicly available.

      Weaknesses:

      There is some concern about the fact that participants report feeling less subjective effort, but also more disliking of tasks when they were earning rewards for others versus self. The concern is that participants worked with less vigor during self-versus-others trials and this may partly account for a key two-way Recipient x Effort interaction on the size of the Reward Positivity EEG component. Of note, participants took longer to complete tasks when working for others. While it is true that, in all cases, participants met the requisite task demands (they pressed the required number of buttons) they did so more sluggishly when earning rewards for others. The Authors argue that this reflects less motivation when working for others, which is a plausible explanation. The Authors also try to rule out this diminished vigor as a confounding explanation by showing that the two way interaction remains even when including reaction times (and also self-reported task liking) as a covariate. Nevertheless, it is possible that covariates do not fully account for the effects of differential motivation levels which would otherwise explain the two-way interaction. As such, I think a caveat is warranted regarding this particular result.

    3. Reviewer #2 (Public review):

      Summary:

      Measurements of the reward positivity, an electrophysiological component elicited during reward evaluation, have previously been used to understand how self-benefitting effort expenditure influences processing of rewards. The present study is the first to complement those measurements with electrophysiological reward after-effects of effort expenditure during prosocial acts. The results provide solid evidence that effort adds reward value when the recipient of the reward is the self but discounts reward value when the beneficiary is another individual.

      Strengths:

      An important strength of the study is that amount of effort, the prospective reward, the recipient of the reward, and whether the reward was actually gained or not were parametrically and orthogonally varied. In addition, the researchers examined whether the pattern of results generalized to decisions about future efforts. The sample size (N=40) and mixed-effects regression models are also appropriate for addressing the key research questions. Those conclusions are plausible and adequately supported by statistical analyses.

    4. Author response:

      The following is the authors’ response to the previous reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The authors test the hypothesis, using an effort-exertion and an effort-based decision-making task while recording brain dynamics with EEG, that the brain processes reward outcomes for effort differentially when they are earned for themselves versus others.

      Strengths:

      The strengths of this experiment include what appears to be a novel finding of opposite signed effects of effort on the processing of reward outcomes when the recipient is self versus others. Also, the experiment is well-designed, the study seems sufficiently powered, and the data and code are publicly available.

      Weaknesses:

      There is some concern about the fact that participants report feeling less subjective effort, but also more disliking of tasks when they were earning rewards for others versus self. The concern is that participants worked with less vigor during self-versus-others trials and this may partly account for a key two-way Recipient x Effort interaction on the size of the Reward Positivity EEG component. Of note, participants took longer to complete tasks when working for others. While it is true that, in all cases, participants met the requisite task demands (they pressed the required number of buttons) they did so more sluggishly when earning rewards for others. The Authors argue that this reflects less motivation when working for others, which is a plausible explanation. The Authors also try to rule out this diminished vigor as a confounding explanation by showing that the two way interaction remains even when including reaction times (and also self-reported task liking) as a covariate. Nevertheless, it is possible that covariates do not fully account for the effects of differential motivation levels which would otherwise explain the two-way interaction. As such, I think a caveat is warranted regarding this particular result.

      We thank Reviewer #1 for the continued positive assessment and for continuing to highlight the caveat regarding the potential influence of differential vigor on the observed RewP interaction effects.

      We agree that a caveat is warranted. As detailed in our previous response (R5), we had already conducted control analyses addressing this concern; however, we acknowledge that these results were not incorporated into the manuscript itself. We have now addressed this by adding the covariate analyses to the Results section, along with an explicit caveat in the Discussion.

      Before describing the specific revisions, we would like to offer a minor clarification: the covariates in our control analyses were trial-by-trial response speed and self-reported effort ratings, rather than task liking ratings as noted in the summary above. Neither response speed nor effort rating predicted RewP amplitudes, and the critical Recipient × Effort and Recipient × Effort × Magnitude interactions remained significant and essentially unchanged. However, as the reviewer rightly pointed out, covariates may not fully capture the effects of differential motivation. Specifically, we have made the following revisions:

      First, we added the covariate control analyses to the Results section: “To rule out the possibility that the differential vigor between self- and other-benefiting trials drove the Recipient × Effort and Recipient × Effort × Magnitude interactions on the RewP, we conducted two control analyses by including trial-by-trial response speed and subjective effort ratings as separate covariates in the RewP model. Neither response speed (b = -0.07, p = .641) nor effort rating (b = 0.10, p = .186) predicted RewP amplitudes, and the critical Recipient × Effort and Recipient × Effort × Magnitude interactions remained significant and essentially unchanged (see Supplementary Table S3 for full regression estimates)” (page 12, para. 1).

      Second, we added a caveat to the Discussion section acknowledging this alternative explanation, which reads, “Another concern is that participants exhibited less vigor when working for others, as indicated by slower response speed and lower subjective effort ratings for other- versus self-benefiting trials. Although our control analyses confirmed that neither covariate predicted RewP amplitudes and the critical interactions remained significant, covariates may not fully capture the effects of differential motivation, and this alternative explanation cannot be entirely ruled out” (page 22, para. 2, lines 9–12; page 23, para. 1).
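
      For readers who prefer the control analysis in symbols, the covariate model described above can be sketched as follows. This is a schematic only: the fixed-effect terms follow the description in this response, but the by-participant random intercept $u_{0j}$ and all symbol names are assumptions made for illustration and are not taken from the manuscript.

```latex
\begin{aligned}
\mathrm{RewP}_{ij} ={}& \beta_0 + \beta_1 R_{ij} + \beta_2 E_{ij} + \beta_3 M_{ij}
  + \beta_4 (R \times E)_{ij} + \beta_5 (R \times E \times M)_{ij} \\
  & + \text{(remaining lower-order terms)}
  + \gamma\, C_{ij} + u_{0j} + \varepsilon_{ij}
\end{aligned}
```

      Here $i$ indexes trials and $j$ participants, $R$, $E$, and $M$ stand for Recipient, Effort, and Magnitude, and $C_{ij}$ is the trial-level covariate (response speed in one control model, effort rating in the other). The control analysis asks whether $\beta_4$ and $\beta_5$ remain significant once $\gamma\, C_{ij}$ is included.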

      Reviewer #2 (Public review):

      Summary:

      Measurements of the reward positivity, an electrophysiological component elicited during reward evaluation, have previously been used to understand how self-benefitting effort expenditure influences processing of rewards. The present study is the first to complement those measurements with electrophysiological reward after-effects of effort expenditure during prosocial acts. The results provide solid evidence that effort adds reward value when the recipient of the reward is the self but discounts reward value when the beneficiary is another individual.

      Strengths:

      An important strength of the study is that amount of effort, the prospective reward, the recipient of the reward, and whether the reward was actually gained or not were parametrically and orthogonally varied. In addition, the researchers examined whether the pattern of results generalized to decisions about future efforts. The sample size (N=40) and mixed-effects regression models are also appropriate for addressing the key research questions. Those conclusions are plausible and adequately supported by statistical analyses.

      We sincerely appreciate Reviewer #2’s positive evaluation of our manuscript and thank the reviewer for recognizing the strength of our experimental design and analysis approach.

    1. eLife Assessment

      This landmark study investigates how patterned human gastruloids can provide insights into neural tube closure. Using a functional screen built on an optimized micro-pattern-based gastruloid protocol and CRISPRi, the authors identify positive and negative regulators and define the epistasis among them. This technical tour de force is exceptional and one of the first studies to reveal new knowledge on human development through embryo models.

    2. Reviewer #1 (Public review):

      Summary:

      This is a wonderful and landmark study in the field of human embryo modeling. It uses patterned human gastruloids to conduct a functional screen on neural tube closure, identifies positive and negative regulators, and defines the epistasis among them.

      Strengths:

      This was achieved by optimizing the micro-pattern-based gastruloid protocol for high efficiency, and then optimizing the delivery of CRISPRi so that it does not disrupt the protocol. This is a technical tour de force as well as one of the first studies to reveal new knowledge on human development through embryo models, which has not been done before.

      Weaknesses:

      A minor one. One can never find out whether findings made in vitro in human embryo models can be revalidated in humans in vivo, for obvious and justified ethical reasons. However, the authors acknowledge this in the "limitations of study" section.

      Comments on revisions:

      The authors have adequately addressed all comments raised.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This is a wonderful and landmark study in the field of human embryo modeling. It uses patterned human gastruloids and conducts a functional screen on neural tube closure, and identifies positive and negative regulators, and defines the epistasis among them.

      Strengths:

      The above was achieved following optimization of the micro-pattern-based gastruloid protocol to achieve high efficiency, and then optimized to conduct and deliver CRISPRi without disrupting the protocol. This is a technical tour de force as well as one of the first studies to reveal new knowledge on human development through embryo models, which has not been done before.

      The manuscript is very solid and well-written. The figures are clear, elegant, and meaningful. The conclusions are fully supported by the data shown. The methods are well-detailed, which is very important for such a study.

      Thank you for this feedback! We are excited for the possibilities of this method to discover genes required for various morphogenetic processes associated with human embryonic development.

      Weaknesses:

      This reviewer did not identify any meaningful, major, or minor caveats that need addressing or correcting.

      A minor weakness is that one can never find out whether the findings made in vitro in human embryo models can be revalidated in humans in vivo. This is for obvious and justified ethical reasons. However, the authors acknowledge this point in the section of the manuscript detailing the limitations of their study.

      Reviewer #2 (Public review):

      Summary:

      This manuscript is a technical report on a new model of early neurogenesis, coupled to a novel platform for genetic screens. The model is more faithful than others published to date, and the screening platform is an advance over existing ones in terms of speed and throughput.

      Thank you for this feedback! We agree that the robust symmetry breaking observed in our model, the comparisons to the human embryo in our cell type analysis, and the ability to conduct large-scale genetic screens represent advancements in the modeling of human neural tube closure that may be built upon in the future.

      Strengths:

      It is novel and useful.

      Weaknesses:

      The novelty of the results is limited in terms of biology, mainly a proof of concept of the platform and a very good demonstration of the hierarchical interactions of the top regulators of GRNs.

      The value of the manuscript could be enhanced in two ways:

      (1) by showing its versatility and transforming the level of neural tube to midbrain and hindbrain, and looking at the transcriptional hierarchies there.

      We thank the reviewer for this valuable suggestion and will keep this in mind for future work. As accurate answers to this question would require the development of robust midbrain and hindbrain organoid models, we believe that this question is outside the scope of the present work.

      (2) by relating the patterning of the organoids to the situation in vivo, in particular with the information in reference 49. The authors make a statement "To compare our findings with in vivo gene expression patterns, we applied the same approach to published scRNA-seq data from 4-week-old human embryos at the neurula stage" but it would be good to have a more nuanced reference: what stage, what genes are missing, what do they add to the information in that reference?

      We agree that a more comprehensive comparison of in vitro and in vivo data would add value to the study. We have added an analysis of the human Week 3 data, as neurulation occurs between Weeks 3 and 4 of human embryogenesis (new Figure 1F). We see our in vitro cell types in both datasets. We also included volcano plots in our supplementary figure to show major differences in gene expression (new Figure S1G). Somewhat surprisingly, embryo samples show higher expression of hemoglobin subunits and other hypoxia-related genes than organoids do, which may indicate hypoxic stress during sample handling in ex vivo experimentation (Schelshortn, et al., 2008) or, alternatively, reflect differences in the metabolic environment between embryos and organoids. We did not find any differences that would have affected our transcription factor candidate selection.

      Recommendations for the authors:

      Reviewing Editor Comments:

      The reviewers were very enthusiastic about the work and provided suggestions for textual changes that will clarify the figures, methods, and results for readers.

      Reviewer #2 (Recommendations for the authors):

      (1) In Figure 1:

      (a) What is the orientation of the images in 1C?

      We have specified in the text and figure legend that this is a top-down view of an outer organoid.

      In this panel, what is the problem with ZO-1 in D4?

      We believe this is non-specific staining of dead cells that are shed into the lumen during folding and closure. We have added this interpretation to the figure legend and added two supplementary time-lapse videos (new Supplementary Video 1 and new Supplementary Video 2) of organoid closure that show dead cells being shed into the lumen, in support of this interpretation.

      (b) What is the three-dimensional organization of these structures, if any? Or are they two-dimensional? In a way, this also refers to 1C.

      We have clarified in the text and figure legend that these organoids are three-dimensional, and that Fig. 1B-C are top-down views.

      (c) Why can't we see FOXG1 among the forebrain markers? This is a very characteristic one.

      We see sparse FOXG1 expression in the human embryo samples at Week 4 (new Figure 1F), which may indicate that FOXG1 expression is upregulated later in the human embryo, after neural tube closure. We do, however, see high levels of other forebrain-associated transcription factors by this time, including OTX2, LHX2, and SIX3.

      (d) The Figure 1 legend needs to be clear about the issues raised here.

      We have updated the Figure 1 legend to address these points.

      (2) Figure 2, could they explain in the text better how they organize the ML gene expression? What are their criteria?

      We thank the reviewer for catching this critical omission. We have added details of our mediolateral axis generation to the Methods section under “Single cell RNA sequencing analysis.”

      (3) Explain how and why the 77 genes were picked up?

      We have clarified at our first mention of 77 genes that this is a subset of our original 78 candidate genes, which were selected as described in the text (last paragraph in the results section “Identifying transcription factor candidates for regulation of anterior neurulation”). We have added a line in the Methods section stating that we were unable to clone a functional guide plasmid against one of our candidates (NR6A1).

      (4) The authors mention the value of the geometry and the mechanics in neural tube closure, but they make no attempt to unravel these inputs, or at least the genes, from their screen, associated with them.

      We have rewritten this discussion of the literature to emphasize the active role of the neural ectoderm compared to the surface ectoderm, in order to justify the genetic analysis of the neural ectoderm rather than the surface ectoderm. We have clarified that our goal is to find upstream developmental drivers (transcription factors) of folding and closure, rather than investigate mechanical mechanisms of this process.

    1. eLife Assessment

      This important study employed a multi-stage behavioural paradigm of increasing cognitive complexity to investigate the role of inhibitory interneurons in the medial prefrontal cortex (mPFC) in avoidance behaviour in mice. The authors used imaging and optogenetic techniques, combined with this behavioural task, to show that mPFC interneurons are necessary for encoding but not for executing avoidance under threat. The evidence supporting these claims is compelling, and findings will be of interest to researchers in behavioural and systems neurosciences.

    2. Reviewer #1 (Public review):

      Summary:

      This study investigates the role of the medial prefrontal cortex (mPFC) in generating goal-directed actions under threat, using a progressive behavioral paradigm, neural recordings, and optogenetic inhibition in mice. The authors demonstrate that while mPFC GABAergic neurons strongly encode cues, actions, and errors, particularly under high cognitive demand, this neural activity is not causally required for executing avoidance behaviors. By rigorously controlling for movement and arousal, the researchers found that much of the observed mPFC signaling actually reflects baseline behavioral states rather than the generation of the actions themselves. This dissociation between encoding and causality challenges traditional views of mPFC as an executive controller of action and provides a nuanced understanding of its role in evaluative and contextual processing.

      Strengths:

      The behavioral paradigm employed in this study is one of its greatest strengths, offering a rigorous, progressive, and well-controlled framework to dissect the neural mechanisms underlying avoidance under threat. This three-phase task design is particularly well-suited to tease apart the contributions of learning, discrimination, and cognitive load to both behavior and neural activity.

      By tracking movement (speed, rotations) and including it as a covariate in statistical models, the authors also underscore the need to control for movement and baseline activity when interpreting cortical signals, which is relevant for all studies of brain-behavior relationships, ensuring that behavioral changes are not due to general arousal or motor activity.

      Finally, the study combines multiple advanced techniques (fiber photometry, single-cell calcium imaging with miniscopes, and two distinct optogenetic inhibition methods) to provide a comprehensive look at both neural encoding and causal necessity.

      Weaknesses:

      The authors conclude that mPFC is not required for avoidance, based on the minimal behavioral effects of optogenetic inhibition. While this interpretation is supported by the data, the choice of viral constructs could lead to an underestimation of the mPFC's role for several reasons. Specifically, the efficacy of eArch3.0 inhibition was not verified beyond histology, and its non-cell-type-specific nature could lead to disinhibition or compensatory activity in downstream regions. Although the authors' use of visual cortex (V1) inhibition as a control suggests that broad cortical inhibition does not impair avoidance, subcortical compensation cannot be ruled out. Additionally, Vgat-ChR2 targets only GABAergic neurons, potentially missing glutamatergic contributions. Addressing these limitations in the Discussion section would strengthen the manuscript.

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript by Sajid et al. describes a comprehensive behavioral, imaging, and optogenetic dataset investigating the role of the mPFC in avoidance and escape behaviors. Although many movement- and task-related variables are encoded by mPFC GABAergic neurons, the main conclusion is that they are unlikely to control behavioral output.

      Strengths:

      The manuscript is generally well executed and plausible in its conclusions. It provides an alternative viewpoint to many articles describing the involvement of mPFC in behavior, based on a complex multi-stage behavioral paradigm acquired and analyzed in an unbiased way.

      Weaknesses:

      This reviewer sees three main weaknesses.

      (1) There are few details on the linear mixed models in the methods. This section could be improved by including a mathematical description. More importantly, the reader never learns how accurately the models capture the data. Given that most conclusions rely on the models, it seems central to address this point carefully. For example, what is the explained variance (marginal and conditional)? Were the nested models compared to non-nested ones (e.g., by AIC)? What are the specific outputs of the likelihood ratio tests briefly mentioned in the methods?

      (2) For several figures, there is a disconnect with the main text, in the sense that it is difficult to understand how statements in the main text connect with specific figure panels or bars in their graphs. This is particularly the case for the most complex figures, e.g., Figures 3, 4, and their supplements. It would be beneficial to introduce subfigure labels (A1, etc.) and state explicitly in the main text which figure panel is described (in parentheses). Alternatively, break down the figures into multiple ones, decreasing ambiguity. This is important because it will help the reader better assess the strength of the results.

      (3) It does not appear that the code and data used to produce the figures are made available. That would be very beneficial, given the complexity of the analysis and dataset collection procedures. It would also help readers better understand the results and probe their validity.
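
      As a point of reference for the quantities requested in point (1), the following is a minimal sketch of how marginal and conditional explained variance, AIC, and a likelihood ratio test are commonly computed from fitted-model summaries. It assumes a Gaussian mixed model with variance components already estimated and follows the variance-partitioning definition usually attributed to Nakagawa and Schielzeth. All function names and numbers are hypothetical and are not drawn from the manuscript.

```python
import math


def r2_marginal_conditional(var_fixed, var_random, var_resid):
    """Return (marginal R^2, conditional R^2) from variance components.

    Marginal R^2: variance explained by fixed effects alone.
    Conditional R^2: variance explained by fixed plus random effects.
    """
    total = var_fixed + var_random + var_resid
    return var_fixed / total, (var_fixed + var_random) / total


def aic(log_lik, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_lik


def lrt_df1(log_lik_reduced, log_lik_full):
    """Likelihood ratio test for nested models differing by one parameter."""
    stat = 2.0 * (log_lik_full - log_lik_reduced)
    # Chi-square survival function with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value


# Made-up variance components and log-likelihoods, for illustration only.
r2_m, r2_c = r2_marginal_conditional(var_fixed=2.0, var_random=1.0, var_resid=1.0)
stat, p = lrt_df1(log_lik_reduced=-100.0, log_lik_full=-98.0)
```

      Reporting these values next to each model would let readers judge how much of the signal the fixed effects capture on their own and how much is absorbed by animal- or session-level random effects.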

    4. Reviewer #3 (Public review):

      I first want to state that I am not an expert in the field, making it hard for me to provide informed comments on the value of the scientific results. But from where I stand, the study seems very carefully designed, very well controlled, and the statistical methodology used across the manuscript is strong and sound.

      Summary:

      The authors investigated the role of PFC interneurons in cue-guided behaviour under threat. They designed a behavioural task with increasing levels of difficulty that allows them not only to correlate the activation of cortical interneurons with different parameters of the tasks, but also to assess if this correlation changes with increasing cognitive load. They carefully take into account confounding factors such as movement and show that indeed neuronal activity is strongly driven by movement. Using generalised linear models throughout their manuscript, the authors could include movement as a confounding factor in their statistical analysis, thus allowing them to next correlate interneuron activity with task-specific parameters. Using first fibre photometry to image bulk activity of the interneurons and by comparing the responses in the PFC and in the visual cortex, they identify that PFC neurons show stronger activation related to punishment compared to the sensory cortex. Interestingly, under high cognitive demand, PFC interneurons show cue-specific activation, which could reflect the involvement of the PFC in cue-selective action selection.

      In a second set of experiments, they use Miniscope to image individual interneurons. They classified interneurons, not based on their expression of specific markers as usually done, but based on their correlation with movement. Using this classification, they identify clusters of neurons that show activity modulation related to various behavioural parameters.

      Lastly, they performed optogenetic manipulations to silence the PFC during cue-guided behaviour and showed little behavioural effect of the manipulation, which they suggest means the PFC is not involved in taking action in this task.

      Strengths:

      The design of the study is backed by convincing arguments from the authors. The confounding factors are carefully taken into account and integrated into state-of-the-art statistics. The results thus appear robust and reliable. The authors do not overinterpret their results; quite the contrary, they are prone to toning down the interpretation of statistically significant results and they warn the readers about potential misinterpretation or confounding factors. The discussion makes for a very interesting and informative reading.

      Weaknesses:

      The main weakness, in my view, lies in the Results section. In the figures, the authors do not present any raw data, and the plots are shown as mean ± SEM without displaying the distribution of individual data points. It is both a strength and a weakness that the authors do not attempt to guide the reader through the Results section and instead present the findings with very little emphasis on the key outcomes of the GLM. While this approach is arguably the most transparent way to report results, it also makes the section quite difficult to follow and may discourage readers.

      I would recommend rewriting the Results section to make it more accessible to a broader audience. A similar issue applies to the figures: presenting all plots reflects a commendable commitment to transparency, but it would greatly benefit from a clearer narrative. As it stands, it is difficult to grasp the message of each figure by simply browsing through them.

    1. eLife Assessment

      This important technical study introduces SCOPE, an optics-free spatial reconstruction method based on bidirectional sender and receiver oligonucleotides on barcoded hydrogel beads. By sequencing proximity-encoded chimeric molecules, the authors computationally reconstruct 2D and 3D spatial information at an impressive scale. The technical demonstrations in synthetic bead systems are convincing and establish proof-of-principle that large spatial domains can be reconstructed without microscopy. The methodological advance is clear and the scale is impressive. Direct validation in biological samples would help clarify what additional limitations on applicability may exist. This work will be of interest to those working on spatial mapping.

    2. Reviewer #1 (Public review):

      Summary:

      Liao et al. present SCOPE (Spatial reConstruction via Oligonucleotide Proximity Encoding), a method for reconstructing spatial organization from diffusion-defined DNA barcode interactions without the use of optical imaging. In SCOPE, hydrogel beads bearing unique DNA barcodes contain both "sender" and "receiver" oligonucleotides. Upon enzymatic release, sender oligos diffuse locally and hybridize to receiver oligos on neighboring beads, forming chimeric molecules that encode spatial proximity. Sequencing these products yields an interaction matrix, which is then used to reconstruct a spatial coordinate map.

      The authors demonstrate reconstruction of synthetic two-dimensional shapes, a large multicolor Snellen eye chart, and the interior surface of three-dimensional molds. The work expands the conceptual and experimental landscape of optics-free spatial sequencing.
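
      To make the "interaction matrix" concrete, here is a minimal sketch of how sequenced sender-receiver barcode pairs could be tallied into a symmetric bead-by-bead count matrix. The function name, the toy reads, and the choices to pool both directions and to drop self-pairs are assumptions for illustration; the authors' actual preprocessing may differ.

```python
from collections import Counter


def interaction_matrix(pairs):
    """Build a symmetric count matrix from (sender, receiver) barcode pairs."""
    beads = sorted({barcode for pair in pairs for barcode in pair})
    index = {barcode: i for i, barcode in enumerate(beads)}
    counts = Counter()
    for sender, receiver in pairs:
        if sender == receiver:
            continue  # assumption: self-pairs carry no neighbor information
        i, j = index[sender], index[receiver]
        # Pool A->B and B->A into one undirected count (an assumption).
        counts[(min(i, j), max(i, j))] += 1
    n = len(beads)
    matrix = [[0] * n for _ in range(n)]
    for (i, j), count in counts.items():
        matrix[i][j] = count
        matrix[j][i] = count
    return beads, matrix


# Hypothetical toy reads: each tuple stands for one sequenced chimeric molecule.
reads = [("A", "B"), ("B", "A"), ("A", "B"), ("B", "C"), ("C", "C")]
beads, matrix = interaction_matrix(reads)
```

      In this toy case beads A and B share three molecules and B and C share one, so A and B would be inferred to sit closer together than B and C.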

      Strengths:

      SCOPE employs bidirectional sender and receiver oligonucleotides on every bead, rather than using asymmetric transmitter-receiver architectures found in other diffusion-based methods. The symmetric design may improve detection sensitivity and reconstruction strategies, and represents a meaningful variation on optics-free spatial encoding.

      A notable strength of this study is the physical scale achieved. The authors reconstruct a Snellen chart spanning approximately 704 mm² and demonstrate molded 3D structures on the order of 75-100 mm³. Although some larger-scale warping is evident, and is discussed as potentially due to non-uniform diffusion, the relative local positioning across these large areas appears impressively accurate.

      The authors extend reconstruction beyond two-dimensional arrays to three-dimensional molded surfaces. This demonstrates that the assay and the computational methods for interpreting proximity graphs can support non-planar spatial relationships, expanding the scope of optics-free spatial inference.

      Weaknesses:

      Although the method is discussed in the context of spatial genomics and potential tissue applications, it is currently demonstrated only on engineered two-dimensional bead arrays and three-dimensional shapes fabricated in molds. It remains unclear how SCOPE would perform in heterogeneous biological environments, where diffusion may exhibit additional non-uniformities. A biological proof-of-concept, even limited in scope, would help define the method's strengths and limitations more clearly.

      The reconstruction of three-dimensional structures lacks strong sampling from volume interiors. This is speculated to be due to several possible factors; however, this limitation constrains the method to reconstruction of volume surfaces rather than comprehensive three-dimensional profiling.

      The reconstruction workflow involves multiple preprocessing steps and embedding choices. While these appear to work well for synthetic shapes with known geometry, it is less clear how parameter choices would be made in contexts where ground truth is unknown. Clarifying how reconstruction robustness is assessed without prior knowledge of spatial structure would help readers understand how the method could be practically deployed, particularly in more heterogeneous tissue contexts.

    3. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Liao et al. present SCOPE (Spatial reConstruction via Oligonucleotide Proximity Encoding), a method for reconstructing spatial organization from diffusion-defined DNA barcode interactions without the use of optical imaging. In SCOPE, hydrogel beads bearing unique DNA barcodes contain both "sender" and "receiver" oligonucleotides. Upon enzymatic release, sender oligos diffuse locally and hybridize to receiver oligos on neighboring beads, forming chimeric molecules that encode spatial proximity. Sequencing these products yields an interaction matrix, which is then used to reconstruct a spatial coordinate map.

      The authors demonstrate reconstruction of synthetic two-dimensional shapes, a large multicolor Snellen eye chart, and the interior surface of three-dimensional molds. The work expands the conceptual and experimental landscape of optics-free spatial sequencing.

      Thank you for this accurate summary of the work.

      Strengths:

      SCOPE employs bidirectional sender and receiver oligonucleotides on every bead, rather than using asymmetric transmitter-receiver architectures found in other diffusion-based methods. The symmetric design may improve detection sensitivity and reconstruction strategies, and represents a meaningful variation on optics-free spatial encoding.

      A notable strength of this study is the physical scale achieved. The authors reconstruct a Snellen chart spanning approximately 704 mm² and demonstrate molded 3D structures on the order of 75-100 mm³. Although some larger-scale warping is evident, and is discussed as potentially due to non-uniform diffusion, the relative local positioning across these large areas appears impressively accurate.

      The authors extend reconstruction beyond two-dimensional arrays to three-dimensional molded surfaces. This demonstrates that the assay and the computational methods for interpreting proximity graphs can support non-planar spatial relationships, expanding the scope of optics-free spatial inference.

      Thank you for highlighting these strengths of SCOPE.

      Weaknesses:

      Although the method is discussed in the context of spatial genomics and potential tissue applications, it is currently demonstrated only on engineered two-dimensional bead arrays and three-dimensional shapes fabricated in molds. It remains unclear how SCOPE would perform in heterogeneous biological environments, where diffusion may exhibit additional non-uniformities. A biological proof-of-concept, even limited in scope, would help define the method's strengths and limitations more clearly.

      We concur with the reviewer that a biological proof-of-concept is a key next step, and that diffusion will be more heterogeneous in this more complex environment. To this end, we are actively working to further develop SCOPE for use in tissue sections, with the goal of capturing transcriptomes, accessible chromatin, and genomes. As part of this work, we also hope to systematically explore a range of tissue permeabilization and tissue clearing approaches to mitigate the impact of heterogeneity on performance.

      The reconstruction of three-dimensional structures lacks strong sampling from volume interiors. This is speculated to be due to several possible factors; however, this limitation constrains the method to reconstruction of volume surfaces rather than comprehensive three-dimensional profiling.

      Thank you for highlighting this important limitation. The 3D reconstructions are indeed constrained by undersampling of volume interiors. We anticipate that this might be addressed via relatively minor adjustments to the protocol, e.g. using light or base-labile linkers to trigger oligo release, with the expectation that this will improve reaction consistency throughout the volume. However, even if we are unable to resolve this issue, we note that surface-resolved reconstructions may be useful for some goals, e.g. embedding a bead-packed gel within a tissue lumen, such as the gut. This could enable surface beads to capture RNA transcripts from adjacent cells, while bead–bead associations serve to define the surface topology.

      The reconstruction workflow involves multiple preprocessing steps and embedding choices. While these appear to work well for synthetic shapes with known geometry, it is less clear how parameter choices would be made in contexts where ground truth is unknown. Clarifying how reconstruction robustness is assessed without prior knowledge of spatial structure would help readers understand how the method could be practically deployed, particularly in more heterogeneous tissue contexts.

      Thank you for the opportunity to clarify. The computational pipeline used for 2D SCOPE reconstruction is designed to operate on a standardized input format and can be applied to arbitrary datasets without prior knowledge of spatial structure. For example, as shown in Figure 3, both the circle and “swoosh” geometries were reconstructed using the same algorithm and identical initial parameters. While certain hyperparameters are pre-specified (e.g. the number of k-nearest neighbors used to compute the pairwise distance matrix for UMAP), these are fixed across datasets. Other parameters, such as UMAP’s “min_dist,” are selected via an automated heuristic grid search that proceeds without user intervention. The agreement with ground truth in these controlled settings, together with the reproducibility of stochastic reconstructions (see Figure 3E-F), supports the robustness of the approach.

      Importantly, there was one exception. Reconstruction of the Snellen eye chart dataset required a manual step, involving an initial 3D UMAP embedding followed by a 2D projection to “flatten” the result. We suspect this reflects radial non-uniformities in sender/receiver oligo diffusion at larger spatial scales. Addressing such confounders algorithmically by explicitly modeling diffusion heterogeneity represents an important area for future work, with the goal of entirely eliminating the need for manual intervention.

      Finally, we note that these benchmark shapes represent somewhat contrived examples, and the geometries encountered in practice may often be much less complex. For example, in conventional spatial genomics, the geometry consists of a bead monolayer forming a flat, regular surface on a rectangular slide of known dimensions. Regardless of the tissue architecture overlaid on this surface, the reconstruction problem is defined by the bead monolayer itself, inferred through sender-receiver interactions.
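
      The k-nearest-neighbor step mentioned above can be illustrated with a minimal sketch. It assumes a symmetric bead-by-bead count matrix as input and ranks partners by raw interaction count; how the actual pipeline converts counts into distances is not described in this response, so the function name and the ranking rule are hypothetical.

```python
def knn_graph(matrix, k):
    """For each bead, return the indices of its k strongest interaction partners.

    Partners with zero counts are never selected, so a bead may end up
    with fewer than k neighbors.
    """
    neighbors = []
    for i, row in enumerate(matrix):
        partners = [(count, j) for j, count in enumerate(row) if j != i and count > 0]
        # Highest count first; ties broken by bead index for reproducibility.
        partners.sort(key=lambda item: (-item[0], item[1]))
        neighbors.append([j for _, j in partners[:k]])
    return neighbors


# Hypothetical counts for four beads arranged roughly along a line.
counts = [
    [0, 5, 1, 0],
    [5, 0, 4, 1],
    [1, 4, 0, 6],
    [0, 1, 6, 0],
]
graph = knn_graph(counts, k=2)
```

      Because k is fixed in advance and the ranking uses only the counts, a step like this needs no knowledge of the true geometry, which is the property the response emphasizes.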

    1. eLife Assessment

      This important study investigates how the brain categorizes written words from different writing systems (e.g., alphabetic vs. non-alphabetic), shedding potential light on the neural basis of language's social‑categorization function. Overall, the evidence supporting the authors' claims is solid, though some analyses and key interpretations would benefit from fuller justification.

    2. Reviewer #1 (Public review):

      Summary:

      This study demonstrates, through a series of EEG and MEG experiments, that the human brain automatically categorizes words from alphabetic and non-alphabetic languages, and it unpacks the neural mechanisms of this process from multiple angles. The work examines not only univariate repetition-suppression (RS) effects, but also how repeating or alternating languages influences the representational similarity of words within and across language categories.

      Strengths:

      The univariate RS effects across multiple experiments lend support to some of the main conclusions.

      Weaknesses:

      I have reservations about the logic underlying the multivariate analyses, and I believe the implications of the control experiments merit fuller discussion.

      (1) Question 1: Logic of the multivariate analyses

      The original text states:

      "The processing of intra-language similarity was quantified as correlation distances between neural responses to two words of the same language, which occurred more frequently and would be inhibited in the Rep-Cond (vs. Alt-Cond) due to habituation (Fig. 1c)...".

      I argue that this passage conflates two levels. Building a representational dissimilarity matrix (RDM) is a data-analysis step; it cannot be equated with a cognitive computation. Hence, there is no sense in which this computation occurs "more frequently" in one condition. RDM construction rests on the pairwise similarity of activity patterns, so even if a task engaged no cognitive computation of representational similarity, we could still compute an RDM. Conversely, if a task factor alters the RDM, we must explain how that factor changes the underlying neural patterns, not claim that it triggers specific cognitive processing. Therefore, I neither understand what "more frequent processing" the authors refer to, nor accept their account of the multivariate results.

      The multivariate result pattern, briefly, is that distances between words, both within and across languages, are larger under the repetition condition. One plausible interpretation is that a word representation comprises two parts: language-type (alphabetic vs. non-alphabetic) and fine-grained identity features (visual shape, orthography, semantics, phonology, etc.). Repetition of language type may, via RS, reduce the weight of the first component, thereby increasing the relative contribution of fine-grained features and amplifying inter-word differences. This could explain the multivariate findings.
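      The two-component interpretation above can be illustrated with a minimal simulation sketch. It assumes, purely for illustration, that each word's activity pattern is the sum of a shared language-type component (scaled by a weight) and an independent word-specific component. The function names, the Gaussian patterns, and the weights 1.0 and 0.5 are hypothetical choices, not the authors' analysis or data.

```python
import numpy as np

def correlation_distance_rdm(patterns):
    """RDM of correlation distances (1 - Pearson r); patterns is (n_items, n_features)."""
    return 1.0 - np.corrcoef(patterns)

def simulate_words(weight, n_words=20, n_features=500, seed=0):
    """Toy patterns for words of one language: weight * shared component + word-specific part."""
    rng = np.random.default_rng(seed)
    language_component = rng.standard_normal(n_features)    # shared by every word (assumption)
    identity = rng.standard_normal((n_words, n_features))   # fine-grained, unique per word
    return weight * language_component + identity

def mean_within_language_distance(weight):
    """Average off-diagonal correlation distance among words of the same language."""
    rdm = correlation_distance_rdm(simulate_words(weight))
    upper = np.triu_indices_from(rdm, k=1)
    return rdm[upper].mean()

# Hypothetical weights: full language-type signal vs. a signal reduced by repetition suppression.
alt_distance = mean_within_language_distance(weight=1.0)
rep_distance = mean_within_language_distance(weight=0.5)
print(f"weight 1.0: {alt_distance:.2f}   weight 0.5: {rep_distance:.2f}")
```

      In this toy setting, lowering the weight of the shared component increases the mean correlation distance between words, without any change in a cognitive "similarity computation". The sketch only covers the within-language case and makes no claim about the effect sizes in the actual recordings.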

      (2) Question 2:

      For unlearned languages, people cannot distinguish lexical from sub-lexical levels. What, then, determines (i) the RS-effect difference between letters and radicals in familiar languages and words in unlearned ones, and (ii) the similarity of repetition effects between words in unlearned and familiar languages? An explicit account is needed.

    3. Reviewer #2 (Public review):

      Summary:

      This study investigates how the human brain categorizes visual words from distinct writing systems (alphabetic vs. non-alphabetic) as a neural basis for the social-categorization function of language. Using a repetition suppression paradigm combined with electroencephalography and magnetoencephalography, the authors conducted nine experiments with independent participants to identify the neural network underlying language-based categorization, characterize its temporal dynamics, and test whether this process operates independently of linguistic properties such as semantic meaning and pronunciation.

      Strengths:

      (1) The study employs a well-validated design with clear control conditions and systematically manipulates key variables, including writing system, language familiarity, and native language background. The use of nine experiments with independent participant samples strengthens the reliability and replicability of the results.

      (2) The work combines EEG and MEG, cross-validating findings across imaging modalities to support the reported neural effects. A combination of univariate, multivariate, and connectivity analyses is used to characterize neural responses and network interactions.

      (3) Results are consistent across multiple language groups and for both familiar and unfamiliar languages, supporting the generalizability of the identified neural mechanism beyond specific languages or prior experience.

      Weaknesses:

      The authors provide compelling evidence that the identified neural network supports the categorization of words by language, including computations of intra-language similarity and inter-language difference. However, the conceptual framing of this finding as directly reflecting the social-categorization function of language may be premature. While the task captures spontaneous language categorization, it does not involve social evaluation or intergroup processes. The connection to social categorization is inferred from prior literature rather than demonstrated within the current experimental design. Clarifying this distinction would strengthen the conceptual precision of the manuscript.

    4. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This study demonstrates, through a series of EEG and MEG experiments, that the human brain automatically categorizes words from alphabetic and non-alphabetic languages, and it unpacks the neural mechanisms of this process from multiple angles. The work examines not only univariate repetition-suppression (RS) effects, but also how repeating or alternating languages influences the representational similarity of words within and across language categories.

      Strengths:

      The univariate RS effects across multiple experiments lend support to some of the main conclusions.

      Weaknesses:

      I have reservations about the logic underlying the multivariate analyses, and I believe the implications of the control experiments merit fuller discussion.

      (1) Question 1: Logic of the multivariate analyses

      The original text states:

      "The processing of intra-language similarity was quantified as correlation distances between neural responses to two words of the same language, which occurred more frequently and would be inhibited in the Rep-Cond (vs. Alt-Cond) due to habituation (Fig. 1c)...".

      I argue that this passage conflates two levels. Building a representational dissimilarity matrix (RDM) is a data-analysis step; it cannot be equated with a cognitive computation. Hence, there is no sense in which this computation occurs "more frequently" in one condition. RDM construction rests on the pairwise similarity of activity patterns, so even if a task engaged no cognitive computation of representational similarity, we could still compute an RDM. Conversely, if a task factor alters the RDM, we must explain how that factor changes the underlying neural patterns, not claim that it triggers specific cognitive processing. Therefore, I neither understand what "more frequent processing" the authors refer to, nor accept their account of the multivariate results.

      The multivariate result pattern, briefly, is that distances between words, both within and across languages, are larger under the repetition condition. One plausible interpretation is that a word representation comprises two parts: language-type (alphabetic vs. non-alphabetic) and fine-grained identity features (visual shape, orthography, semantics, phonology, etc.). Repetition of language type may, via RS, reduce the weight of the first component, thereby increasing the relative contribution of fine-grained features and amplifying inter-word differences. This could explain the multivariate findings.

      Thank you for these insightful comments regarding the logic of the multivariate analyses. In the revision, we will clarify that the multivariate analyses were conducted to assess correlation distances between neural responses to pairs of words, either within the same language or across different languages. The processing of intra-language similarity was assessed rather than defined by conducting the multivariate analyses. We will further elaborate the rationale underlying our experimental design, specifically why the processing of intra-language similarity is expected to occur more frequently in the repetition condition (Rep-Cond) than in the alternation condition (Alt-Cond).

      We also appreciate the alternative account of the observed neural repetition suppression (RS) effects in terms of language-type versus fine-grained identity feature processing. This perspective will be incorporated into the revised Discussion. In particular, we will outline the patterns of neural activity predicted by an account that assumes an increasing contribution of fine-grained features, and evaluate the extent to which our findings are consistent with these predictions.

      (2) Question 2:

      For unlearned languages, people cannot distinguish lexical from sub-lexical levels. What, then, determines (i) the RS-effect difference between letters and radicals in familiar languages and words in unlearned ones, and (ii) the similarity of repetition effects between words in unlearned and familiar languages? An explicit account is needed.

      Thank you for this helpful suggestion. In the revised manuscript, we will include a dedicated paragraph addressing these two issues. Specifically, we will provide a more precise account of the differences in repetition suppression (RS) effects between letters and radicals in familiar languages, as well as the similar RS effects observed for unlearned and familiar languages. These additions will help clarify the interpretation of the neural RS effects associated with visual word processing and strengthen the theoretical implications of our findings.

      Reviewer #2 (Public review):

      Summary:

      This study investigates how the human brain categorizes visual words from distinct writing systems (alphabetic vs. non-alphabetic) as a neural basis for the social-categorization function of language. Using a repetition suppression paradigm combined with electroencephalography and magnetoencephalography, the authors conducted nine experiments with independent participants to identify the neural network underlying language-based categorization, characterize its temporal dynamics, and test whether this process operates independently of linguistic properties such as semantic meaning and pronunciation.

      Strengths:

      (1) The study employs a well-validated design with clear control conditions and systematically manipulates key variables, including writing system, language familiarity, and native language background. The use of nine experiments with independent participant samples strengthens the reliability and replicability of the results.

      (2) The work combines EEG and MEG, cross-validating findings across imaging modalities to support the reported neural effects. A combination of univariate, multivariate, and connectivity analyses is used to characterize neural responses and network interactions.

      (3) Results are consistent across multiple language groups and for both familiar and unfamiliar languages, supporting the generalizability of the identified neural mechanism beyond specific languages or prior experience.

      Weaknesses:

      The authors provide compelling evidence that the identified neural network supports the categorization of words by language, including computations of intra-language similarity and inter-language difference. However, the conceptual framing of this finding as directly reflecting the social-categorization function of language may be premature. While the task captures spontaneous language categorization, it does not involve social evaluation or intergroup processes. The connection to social categorization is inferred from prior literature rather than demonstrated within the current experimental design. Clarifying this distinction would strengthen the conceptual precision of the manuscript.

      Thank you for raising this important point. In the revised Discussion, we will include an additional paragraph to clarify several related issues. First, prior research suggests that language can serve as a socially relevant category cue. Second, these findings imply that rapid categorization of words by language may occur in the human brain. Third, our results identify a neural network supporting such rapid language-based categorization but do not directly test how this process relates to social categorization. Highlighting these points will help delineate the scope of our findings and point to important directions for future research.

      We'll work on a revision of the manuscript and will submit the revision when it's ready.

    1. eLife Assessment

      This important study reports that an oncogenic population in an epithelium can either be repressed or spread, depending on the tissues. This is explained by hypothesising the existence of a heterotypic tension at the boundary between different cell types, and supported by pharmacological perturbations and numerical simulations using the vertex model. The solid study conveys a key message, although some uncertainty remains regarding the origin of the heterotypic tension in relation to acto-myosin organisation in the boundary cells.

    2. Reviewer #1 (Public review):

      Summary:

      The behaviour of cells expressing constitutively active HRas is examined in mosaic monolayers, both in MCF10a breast epithelial and Beas2b bronchial epithelial cell lines, mimicking the potential initial phase of development of carcinoma. Single HRas-positive cells are excluded from MCF10a but not Beas2b monolayers. Most interestingly, however, when in groups, these cells are not excluded, but rather sharply segregated within a MCF10a monolayer. In contrast, they freely mix with wt Beas2b cells. Biophysical analysis identifies high tension at heterotypic interfaces between HRas and wild-type cells as the likely reason for segregation of MCF10a cells. The hypothesis is supported experimentally, as myosin inhibition abolishes segregation. The probable reason for lack of segregation in the bronchial epithelium is to be found in the different intrinsic properties of these cells, which form a looser tissue with lower basal actomyosin activity. The behaviour of single cells and groups is recapitulated in a vertex model based on the principle of differential interfacial tension, under the condition of high heterotypic interfacial tension.

      Strengths:

      Despite being long recognized as a crucial event during cancer development, segregation of oncogenic cells has been a largely understudied question. This nice work addresses the mechanics of this phenomenon through a straightforward experimental design, applying the biophysical analytical approaches established in the field of morphogenesis. Comparison between two cell types provides some preliminary clues on the diversity of effects in various cancers.

      Weaknesses:

      Although not calling into question the main message of this study, there are a few issues that one may want to address:

      (1) One may be careful in interpreting the comparison between MCF10a and Beas2b cells as used in this study. The conditions may not necessarily be representative of the actual properties of breast and bronchial epithelia. How much of the epithelial organization is reconstituted under these experimental conditions remains to be established. This is particularly obvious for bronchial cells, which would need quite specific culture conditions to build a proper bronchial layer. In this study, they seemed to be on the verge of a mesenchymal phenotype (large gaps, huge protrusions, cells growing on top of each other, as mentioned in the manuscript).

      As an alternative to Beas2b, comparison of MCF10a with another cell line capable of more robust in vitro epithelial organization, but ideally with different adhesive and/or tensile properties, would be highly interesting, as it may narrow down the parameters involved in segregation of oncogenic cells.

      (2) While the seminal description of tissue properties based on interfacial tensions (Brodland 2002) is clearly key to interpreting these data, the actual "Differential Interfacial Tension Hypothesis" poses that segregation results from global differences, i.e., juxtaposition of two tissues displaying different intrinsic tensions. On the contrary, the results of the present work support a different scenario, where what counts is the actual difference in tension ALONG the tissue boundary, in other words, that segregation is driven by high HETEROTYPIC interfacial tension. This is an important distinction that should be clarified.

      (3) Related: The fact that actomyosin accumulates at the heterotypic interface is key here. It would be quite informative to better document the pattern of this accumulation, which is not clear enough from the images of the current manuscript: Are we talking about the actual interface between mutant and wt cells (membrane/cortex of heterotypic contacts)? Or is it more globally overactivated in the whole cell layer along the border? Some better images and some quantification would help.

      (4) In the case of Beas2b cells, mutant cells show higher actin than wt cells, while actin is, on the contrary, lower in mutant MCF10a cells (Figure 2b). Has this been taken into account in the model? It may be in line with the idea that HRas may have a different action on the two cell types, a possibility that would certainly be worth considering and discussing.

      Comments on revisions:

      There is still one last point that should be made even clearer:

      The system is being modelled based on the principle of INTERFACIAL TENSION, a description pioneered by the works of Steinberg and of Harris, and nicely conceptualized by Brodland (2002). Now the observed behaviour is a perfect case of sorting based on higher interfacial tension AT the boundary between cell types (with nice additional documentation of local actin and myosin enrichment in the revised manuscript). What needs to be made crystal clear is that this is NOT equivalent to the model of DITH ("DIFFERENTIAL INTERFACIAL TENSION HYPOTHESIS") (Brodland 2002, Krieg et al 2008). It is important to stop using DITH in this context, as it leads to confusion and misinterpretations. Indeed, DITH predicts cell/tissue sorting based on differences in interfacial tension WITHIN the two cell types. While DITH accounts for relative POSITIONING (one tissue engulfing the other), it is now established that this is not the motor for cell sorting and tissue segregation; the key parameter is the heterotypic tension at the heterotypic interface. I thus invite the authors to avoid the terms "differential"/DITH, and rather to use either "interfacial tension" or, more specifically, "HIGH HETEROTYPIC INTERFACIAL TENSION".

      Related: the authors correctly cite Canty et al NatComm2017 when discussing this phenomenon. I suggest adding a key supporting reference: "D.M. Sussman, J.M. Schwarz, M.C. Marchetti, M.L. Manning, Soft yet sharp interfaces in a vertex model of confluent tissue, Phys. Rev. Letters 120 (2018) 058001". One may also include another pioneering work in Drosophila: "M. Aliee, J.C. Roper, K.P. Landsberg, C. Pentzold, T.J. Widmann, F. Julicher, C. Dahmann, Physical mechanisms shaping the Drosophila dorsoventral compartment boundary, Curr. Biol. 22 (2012) 967-976."
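      The role of heterotypic interfacial tension discussed above can be illustrated with a deliberately simplified toy calculation. The sketch below is not a vertex model and not the authors' simulation: it assumes a square lattice of cells with unit-length edges, assigns a hypothetical tension only to edges shared by cells of different type, and compares a fully mixed arrangement with a segregated one. The grid size and tension value are arbitrary illustrative choices.

```python
import numpy as np

def heterotypic_energy(grid, tension):
    """Sum the tension over unit-length edges that separate cells of different type."""
    grid = np.asarray(grid)
    horizontal = np.count_nonzero(grid[:, 1:] != grid[:, :-1])  # left-right neighbours
    vertical = np.count_nonzero(grid[1:, :] != grid[:-1, :])    # up-down neighbours
    return tension * (horizontal + vertical)

n = 8
rows, cols = np.indices((n, n))

# Fully mixed: checkerboard of wild-type (0) and mutant (1) cells, 32 of each.
mixed = (rows + cols) % 2

# Segregated: same cell numbers, but the two types occupy separate halves.
segregated = (cols >= n // 2).astype(int)

tension = 1.0  # hypothetical heterotypic line tension, arbitrary units
print("mixed:", heterotypic_energy(mixed, tension))
print("segregated:", heterotypic_energy(segregated, tension))
```

      With a positive heterotypic tension, the segregated arrangement has far fewer heterotypic edges and therefore a lower interfacial energy, which is the intuition behind sorting driven by tension at the boundary rather than by differences within each cell population.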

    3. Reviewer #2 (Public review):

      Summary:

      The authors investigate the behavior of oncogenic cells in mammary and bronchial epithelia. They observe that individual oncogenic cells are preferentially excluded from the mammary epithelium, but they remain integrated in the bronchial epithelium. They also observe that clusters of oncogenic cells form a compact cluster in mammary epithelium, but they disperse in the bronchial epithelium. The authors demonstrate experimentally and in the vertex model simulations that the difference in observed behavior is due to the differential tension between the mutant and wild-type cells due to a differential expression of actin and myosin.

      Strengths:

      * Very detailed analysis of experiments to systematically characterize and quantify differences between mammary and bronchial epithelia

      * Detailed comparison between the experiments and vertex model simulations to identify the differential cell line tension between the oncogenic and wild-type cells as one of the key parameters that are responsible for the different behavior of oncogenic cells in mammary and bronchial epithelia

      Weaknesses:

      * It is unclear what is the mechanistic origin of the shape-tension coupling, which is used in the vertex model, and how important that coupling is for the presented results. Authors claim that the shape-tension coupling is due to the anisotropic distribution of stress fibers when cells are under external stress. It is unclear why the stress fibers should affect an effective line tension on the cell boundaries and why the stress fibers should be sensitive to the magnitude of the internal isotropic cell pressure. In experiments, it makes sense that stress fibers form when cells are stretched. Similar stress fibers form when cytoskeleton or polymer networks are stretched. It is unclear why the stress fibers should be sensitive to the magnitude of internal isotropic cell pressure. If all the surrounding cells have the same internal pressure, then the cell would not be significantly deformed due to that pressure and stress fibers would not form. Authors should better justify the use of the shape-tension coupling in the model, since most of the observed behavior is already captured by the differential tension even if there is no shape-tension coupling.

      * The observed difference of shape indices between the interfacial and bulk cells in simulations in the absence of differential line tension is concerning. This suggests that either there are not enough statistics from the simulations or that something is wrong with the simulations. For all presented simulation results, the authors should repeat multiple simulations and then present both averages and standard deviations. This way it would be easier to determine whether the observed differences in simulations are statistically significant.
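      The request for averages and standard deviations over repeated simulations can be sketched as follows. The example assumes the shape index is defined as perimeter divided by the square root of area, as is common in the vertex-model literature. The function names and the per-run values are hypothetical and are not taken from the manuscript.

```python
import numpy as np

def polygon_area_perimeter(vertices):
    """Area (shoelace formula) and perimeter of a simple polygon given ordered vertices."""
    v = np.asarray(vertices, dtype=float)
    nxt = np.roll(v, -1, axis=0)  # each vertex's successor, wrapping around
    area = 0.5 * abs(np.sum(v[:, 0] * nxt[:, 1] - nxt[:, 0] * v[:, 1]))
    perimeter = np.sum(np.linalg.norm(nxt - v, axis=1))
    return area, perimeter

def shape_index(vertices):
    """Dimensionless shape index: perimeter / sqrt(area)."""
    area, perimeter = polygon_area_perimeter(vertices)
    return perimeter / np.sqrt(area)

def summarize_runs(per_run_means):
    """Mean and sample standard deviation (ddof=1) across independent simulation runs."""
    values = np.asarray(per_run_means, dtype=float)
    return values.mean(), values.std(ddof=1)

# Reference shapes: a unit square and a regular hexagon.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
angles = np.arange(6) * np.pi / 3
hexagon = np.column_stack([np.cos(angles), np.sin(angles)])
print(shape_index(square), shape_index(hexagon))

# Hypothetical per-run mean shape indices for interfacial cells in three repeats.
mean, sd = summarize_runs([3.8, 3.9, 4.0])
print(f"{mean:.2f} +/- {sd:.2f}")
```

      Reporting the per-condition mean with its standard deviation across runs, as in `summarize_runs`, makes it possible to judge whether a difference between interfacial and bulk cells exceeds run-to-run variability.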

    4. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      (1) One may be careful in interpreting the comparison between MCF10a and Beas2b cells as used in this study. The conditions may not necessarily be representative of the actual properties of breast and bronchial epithelia. How much of the epithelial organization is reconstituted under these experimental conditions remains to be established. This is particularly obvious for bronchial cells, which would need quite specific culture conditions to build a proper bronchial layer. In this study, they seemed to be on the verge of a mesenchymal phenotype (large gaps, huge protrusions, cells growing on top of each other, as mentioned in the manuscript).

      We thank the reviewer for this important point. We agree that our experimental conditions do not fully recapitulate the in vivo architecture of either breast or bronchial epithelia. As the reviewer points out, the two cell lines need specific culture conditions to grow in an in vivo-like architecture, such as acinar structures for mammary tissue and a pseudostratified architecture for bronchial tissue, and it would certainly be interesting to grow the cell lines in these organotypic architectures and study the fate of oncogenic mutant cells. However, this would be an independent study in its own right and is outside the scope of the current manuscript. Here, we intend to compare these two well-established epithelial lines from mammary and bronchial epithelial tissues, with distinct intrinsic mechanical and organisational properties, in minimal culture conditions, and study how just the context of having two different sources of epithelial cells can change the fate of oncogenic cells present in the wild-type population. We have now also performed experiments with the MDCK cell line, which, unlike the BEAS2B line, has well-defined cell-cell adhesions [Supplementary figure 4a] and epithelial morphology, and shown that the fate of HRasV12 mutants there also differs from that in the MCF10A cell line.

      (2) As an alternative to Beas2b, comparison of MCF10a with another cell line capable of more robust in vitro epithelial organization, but ideally with different adhesive and/or tensile properties, would be highly interesting, as it may narrow down the parameters involved in the segregation of oncogenic cells.

      We agree with the reviewer and, in line with this suggestion, have repeated the key experiments using Madin-Darby Canine Kidney (MDCK) cells, a well-established model epithelial cell line. Our results show that even though MDCK cells have properties significantly distinct from those of BEAS2B cells (MDCK being more epithelial-like than BEAS2B), the dynamics of the HRasV12 clusters in these two systems are similar [Supplementary figure 4b], and distinctly different from those in mammary epithelial cells (MCF10A). We did not observe the formation of an actin belt around HRasV12 clusters in MDCK monolayers, whereas such a belt does form in MCF10A monolayers. Additionally, in MDCK cells, the HRasV12 mutant clusters are not under compaction or jamming; instead, they form protrusions similar to the ones seen in BEAS2B monolayers. These results solidify our hypothesis of tissue-specific differences in the mechanics of cancer initiation.

      (3) While the seminal description of tissue properties based on interfacial tensions (Brodland 2002) is clearly key to interpreting these data, the actual "Differential Interfacial Tension Hypothesis" poses that segregation results from global differences, i.e., juxtaposition of two tissues displaying different intrinsic tensions. On the contrary, the results of the present work support a different scenario, where what counts is the actual difference in tension ALONG the tissue boundary, in other words, that segregation is driven by high HETEROTYPIC interfacial tension. This is an important distinction that should be clarified.

      We thank the reviewer for this insightful comment. As correctly noted, Brodland’s 2002 work provided a foundational formulation of the Differential Interfacial Tension Hypothesis (DITH), which frames tissue organization in terms of effective interfacial tensions.

      While in its original form DITH emphasised segregation as a consequence of global differences in the intrinsic (bulk) tensions of juxtaposed tissues, our results specifically show that segregation is determined by local interfacial mechanics between transformed and host cells. These local interfacial dynamics, however, are related to the global contractility of cells. In our experiments with blebbistatin, we observed a loss in the efficiency of segregation upon reducing global contractility, which inhibits the formation of the interfacial actomyosin belt that serves as the source of the interfacial tension between healthy and mutant populations. Therefore, the differences in local interfacial mechanics stem from the intrinsic global contractility of the cells discussed here.

      We have also clarified this distinction in the Discussion and have explicitly stated that while DITH provided the foundation for conceptualizing tissue mechanics, our findings on transformed cell–healthy cell interactions specifically demonstrate that a higher efficiency of segregation is driven by high heterotypic interfacial tension at the tissue boundary.

      (4) Related: The fact that actomyosin accumulates at the heterotypic interface is key here. It would be quite informative to better document the pattern of this accumulation, which is not clear enough from the images of the current manuscript: Are we talking about the actual interface between mutant and wt cells (membrane/cortex of heterotypic contacts)? Or is it more globally overactivated in the whole cell layer along the border? Some better images and some quantification would help.

      We agree that a detailed visualisation of actomyosin distribution would strengthen our conclusions. We have now added a few more images of the interface to the Supplementary Data [Supplementary figure 5], which show that cortical actin accumulates in individual cells at the wild-type cell–mutant cell interface, and that actin levels go up in both wild-type and mutant populations at the interface. This is also clear from the quantification of different regions of interest [Figure 2e], which was done by segmenting individual cells in these regions and quantifying actin intensity in each cell.

      (5) In the case of Beas2b cells, mutant cells show higher actin than wt cells, while actin is, on the contrary, lower in mutant MCF10a cells (Author response image 2). Has this been taken into account in the model? It may be in line with the idea that HRas may have a different action on the two cell types, a possibility that would certainly be worth considering and discussing.

      We thank the reviewer for raising this important point. While a direct experimental dissection of how the HRasV12 mutation affects actin levels in BEAS2B and MCF10A cells individually is beyond the scope of the present study, we do not rule out the possibility that the HRasV12 mutation may exert cell-type-specific biochemical effects on actin regulation in these two epithelial systems.

      Although the difference in actin between the mutants and the wild-type cells has not been incorporated into the model presented in the manuscript, we have now shown how actin levels change in response to the interfacial tension formed between the mutant and wild-type cells by adding a mechanochemical feedback to the model. Rather than prescribing intrinsic differences in actin levels between mutant and wild-type cells, we asked whether the feedback between the actin cytoskeleton and mechanical stress alone is sufficient to generate the observed actin reorganization. To address this, we incorporated a mechanochemical feedback loop (MCFL-I), originally developed in our earlier work [35], into the vertex model framework. This feedback captures the experimentally observed coupling between cell shape, actomyosin organization, and mechanical stress (i.e., heterotypic interfacial tension), and has previously been shown to reproduce biologically realistic epithelial behaviours such as dynamic cell shapes and heterogeneous actomyosin distributions [35].

      In this framework, actin is not introduced as an explicit or intrinsic variable. Instead, changes in actomyosin organization emerge dynamically in response to mechanical stresses. Specifically, MCFL-I allows the preferred area and preferred perimeter of cells to evolve depending on cell shape and actomyosin binding, rather than remaining fixed. From these evolving parameters, we compute the normalized contractility, which we interpret as a proxy for bulk actin, and the normalized line tension, which we interpret as a proxy for junctional actin. These normalized quantities provide size-independent measures of actomyosin organization across the tissue.

      The equations for MCFL-I can be written as:

      Thus, with MCFLs, the vertex model does not have fixed A₀ and P₀. The cells dynamically change these parameters depending on the vertex model dynamics. The constitutive relations are given below [1]:

      Here, is the fraction of myosin bound to actin as a function of cell area 𝐴. This nonlinear dependence arises from the load- or strain-dependent binding of myosin to actin, and is a model parameter which is proportional to the binding affinity of myosin to actin in the absence of any strain. We consider this parameter to be the same for both mutant and wild-type cells. Importantly, both mutant and wild-type cells obey identical mechanochemical rules in the model. Differences in actin organization arise solely due to differences in mechanical stress generated by differential interfacial tension. Positive differential interfacial tension compresses mutant cells within clusters. This leads to different 𝐴<sub>0</sub> and 𝑃<sub>0</sub> across the monolayer via MCFL-I, and thus to reduced bulk actin and increased junctional actin [Appendix figure 4], consistent with experimental observations. Conversely, when differential interfacial tension is weak or negative, mutant and wild-type cells experience similar stresses, and the model predicts minimal differences in actin organization [Appendix figure 5].

      Thus, while HRas<sup>V12</sup>-dependent biochemical effects may indeed differ between BEAS2B and MCF10A cells, our results demonstrate that mechanical interactions at mutant–wild-type interfaces are sufficient to generate distinct actin signatures in the two tissues, without invoking cell-type-specific actin regulation. We have added the details of the mechanochemical feedback loop in the model to the Appendix to emphasize that the model tests the sufficiency of mechanics-driven actin reorganization rather than excluding additional biochemical contributions.

      Even for Λ > 0, the normalized line tension at the interface may appear to be negative. This is, however, just an artefact of the colorbar limits we used to allow comparison with the Λ < 0 case. If we plot with different colorbar limits, we see that the interface has a positive normalized line tension, as shown in Author response image 1.

      Author response image 1.

      Reviewer #2 (Public review):

      (1) It is unclear what the mechanistic origin of the shape-tension coupling is, which is used in the vertex model, and how important that coupling is for the presented results. The authors claim that the shape-tension coupling is due to the anisotropic distribution of stress fibers when cells are under external stress. It is unclear why the stress fibers should affect an effective line tension on the cell boundaries and why the stress fibers should be sensitive to the magnitude of the internal isotropic cell pressure. In experiments, it makes sense that stress fibers form when cells are stretched. Similar stress fibers form when the cytoskeleton or polymer networks are stretched. It is unclear why the stress fibers should be sensitive to the magnitude of internal isotropic cell pressure. If all the surrounding cells have the same internal pressure, then the cell would not be significantly deformed due to that pressure, and stress fibers would not form. The authors should better justify the use of the shape-tension coupling in the model and also present simulation results without that coupling. I expect that most of the observed behavior is already captured by the differential tension, even if there is no shape-tension coupling.

      The reviewer is correct in stating that most of the observed behaviour is already captured by the differential tension, without the shape-tension coupling. However, the shape-tension coupling has been used here in accordance with the experimental observation that the cells at the interface are aligned and elongated along the interface [Fig. 2h], which cannot be captured without the shape-tension coupling. The difference between shape indices of cells at the interface and away from the boundary is plotted versus the interfacial tension in the case of no shape-tension coupling [Appendix figure 2]. The red dashed line represents the experimental value of the shape index difference. The blue line is the shape index difference between two randomly chosen groups of cells (half of the total number of cells in each group is taken). At zero line tension, the difference in shape index between interface cells and cells away from the interface is the same as that between randomly chosen groups of cells, which is expected since there should be no interface at zero line tension. The data without shape-tension coupling presented here are averaged over 19 seeds. Although the results without shape-tension coupling reach experimental values at high enough differential tension [Appendix figure 3], a closer inspection of the simulation results shows that the cells are just squeezed and are aligned perpendicular to the interface, which is contrary to what is seen in experiments [Fig. 2h].

      Calculating the average of the absolute value of the dot product of the nematic director and the interface edge for simulations with and without shape-tension coupling [Appendix figure 3] clearly shows that with shape-tension coupling, the cells align and elongate along the interface as is seen in experiment, given by an interface dot product value > 0.5 at high enough line-tension values. Further, shape-tension coupling, or biased edge tension, has been used before to model cell elongation during embryo elongation [45]. Here we use it as an active line-tension force, which elongates cells along the interface, in addition to the differential tension, which is passive. This additional quantification of the alignment and elongation of cells along the interface will be added to the Appendix.
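
      The alignment measure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used for the Appendix: the nematic director is assumed to be the principal axis of the second-moment tensor of a cell's vertices, and the function names and the example cell are hypothetical.

```python
import numpy as np

def nematic_director(vertices):
    """Unit vector along the long axis of a polygonal cell.

    Assumption: the director is the principal eigenvector of the
    second-moment tensor of the vertex positions about their mean.
    """
    pts = np.asarray(vertices, dtype=float)
    centered = pts - pts.mean(axis=0)
    moment = centered.T @ centered / len(pts)
    eigvals, eigvecs = np.linalg.eigh(moment)  # eigenvalues in ascending order
    return eigvecs[:, -1]  # eigenvector of the largest eigenvalue

def interface_alignment(directors, edge_tangents):
    """Mean |d . t| over interface cells: 1 = parallel, 0 = perpendicular."""
    d = np.asarray(directors, dtype=float)
    t = np.asarray(edge_tangents, dtype=float)
    d = d / np.linalg.norm(d, axis=1, keepdims=True)
    t = t / np.linalg.norm(t, axis=1, keepdims=True)
    return float(np.mean(np.abs(np.sum(d * t, axis=1))))

# Hypothetical cell: a 4 x 1 rectangle elongated along x.
cell = [(0, 0), (4, 0), (4, 1), (0, 1)]
director = nematic_director(cell)
along = interface_alignment([director], [(1.0, 0.0)])   # interface edge along x
across = interface_alignment([director], [(0.0, 1.0)])  # interface edge along y
```

      Taking the absolute value makes the measure insensitive to the arbitrary sign of the director and of the edge direction, which is what a nematic (head-tail symmetric) quantity requires.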

      (2) The observed difference of shape indices between the interfacial and bulk cells in simulations in the absence of differential line tension is concerning. This suggests that either there are not enough statistics from the simulations or that something is wrong with the simulations. For all presented simulation results, the authors should repeat multiple simulations and then present both averages and standard deviations. This way, it would be easier to determine whether the observed differences in simulations are statistically significant.

      The difference in shape indices between the interfacial and bulk cells in simulations has now been calculated over 11 different seed values. The observed differences in simulations, along with the standard deviations have been plotted in Figure 4b. This figure will be updated to include the standard deviations. The nonzero difference in shape index in the absence of differential line tension for low values of stress threshold is due to the shape-tension coupling acting even at low differential tension. Thus, a non-zero, sufficiently high value of the stress threshold is required in our model with shape-tension coupling. This has also been stated in section 4 of the paper. The importance of the shape-tension coupling has been stated in response to the previous point.
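
      The per-seed statistics described above can be sketched as follows. This is an illustrative example only: it assumes the shape index is defined as perimeter divided by the square root of area, as is common in vertex models, and the function names and input values are hypothetical.

```python
import numpy as np

def shape_index(perimeter, area):
    """Dimensionless shape index p = P / sqrt(A) (assumed definition)."""
    return perimeter / np.sqrt(area)

def shape_index_difference(perimeters, areas, is_interface):
    """Mean shape index of interface cells minus that of all other cells."""
    p = shape_index(np.asarray(perimeters, float), np.asarray(areas, float))
    mask = np.asarray(is_interface, bool)
    return float(p[mask].mean() - p[~mask].mean())

def summarize_over_seeds(differences):
    """Mean and sample standard deviation across independent seeds."""
    d = np.asarray(differences, float)
    return float(d.mean()), float(d.std(ddof=1))

# Hypothetical shape index differences from three seeds.
mean_diff, std_diff = summarize_over_seeds([0.1, 0.2, 0.3])
```

      Reporting the mean together with the standard deviation across seeds makes it possible to judge whether a nonzero difference at zero differential tension lies within seed-to-seed variation.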

      (3) The authors should also analyze the cell line tension data in simulations and make a comparison with experiments.

      The line tension for each edge can be calculated as .

      Although the line tension distributions look similar to the ones obtained from Bayesian Force Inference, a better comparison is between the normalized line tension and the actin seen in experiment, as we have discussed in response to point (4) raised by Reviewer 1.

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      (1) The authors claim that the negative tension Lambda<0 resembles the Beas2b phenotype. This is not consistent with the expression of actin in Figure 2f, which seems very similar in all four regions of interest (ROIs). Also, the segregation index data for Beas2b in Figure 1h looks very different from the demixing parameter in Figure 4f for the negative value of Lambda.

      In the model presented in the previous version of the manuscript, actin differences have not been incorporated. We have only added an interfacial line tension, which might arise only at the interface between cells. In response to comment (4) from Reviewer 1, we have considered a vertex model with mechanochemical feedback and interfacial line tension to understand how actin distribution in the tissue is affected by interfacial tension. The results presented match very well with experimental images.

      The reviewer has rightly pointed out that the segregation index (SI) data presented in Fig. 1h have a different trend compared to those in Fig. 4f. However, it is essential to note that in the simulation, the initial condition is one in which the mutant cluster is already fully segregated, whereas this is not the case at the initial time points in experiments. Thus, the two plots are not directly comparable and only show how SI changes in our simulations. It is more effective to compare the final time points in Fig. 2f with those in Fig. 4e, where we observe that Mcf10a has a higher SI compared to Beas2b, and the case with Λ > 0 has a higher SI than the case with Λ < 0. This supports our claim that Λ < 0 resembles the Beas2b phenotype and Λ > 0 resembles the Mcf10a phenotype.

      (2) It is unclear how the threshold pressure Pi_0 is implemented for the shape-tension coupling in the vertex model. Is the value of the additional tension gamma_ij equal to 0 if the internal pressure is below that threshold?

      The stress threshold is implemented for the shape-tension in the vertex model in the following way. The line tension forces can be written as:

      where, and . If the stress on the cell is below the threshold, then the additional tension is zero for those cells.

      (3) In vertex model simulations, the authors use identical parameters for wild-type and mutant cells. This does not seem to be consistent with experimental observations in Figure 2, where the expression of actin is different, and also, cell shape indices are different for the wild-type and mutant cells. The authors should comment on how that choice affects their simulation results.

      We thank the reviewer for this comment. As noted in our response to comment (4) from Reviewer 1, we have now run our simulations after adding a mechanochemical feedback to the model. Here, both wild-type and mutant cells follow identical mechanochemical rules within the vertex model. This choice does not imply that the cells are mechanically identical in the tissue; rather, it allows us to test whether differences in cell shape and actin organization can emerge purely from mechanical interactions.

      By incorporating the mechanochemical feedback loop (MCFL-I), the model captures how heterotypic interfacial tension redistributes mechanical stresses between mutant and wild-type cells. These stresses lead to differences in cell area, perimeter, and shape, which are then translated via MCFL-I into distinct bulk and junctional actin signatures. Consequently, even though the intrinsic parameters are the same, the emergent mechanical environment reproduces the experimentally observed differences in actin intensity and cell shape indices (as shown in Figure 2).

      Thus, our approach demonstrates that the experimentally observed heterogeneity between mutant and wild-type cells can arise solely from interface-driven mechanical effects, without prescribing any cell-type-specific parameters in the model.

      (4) Also provide data for cell line tensions in the vertex model, which can then be compared with the experimental data in Figure 2. This is especially important because the differential cell line tension at the interface of mutants and wild-type cells seems to be playing a very important role.

      The cell tensions from the vertex model have been plotted in the response to main comment (3) from Reviewer 2. Since the interfacial tension has been included by hand as an extra term in the vertex model, it is not trivial to compare the line tensions from the vertex model directly with the experimental data. However, we can understand how the tensions are distributed by looking at the normalised tension and normalised contractility plotted in response to comment (4) from Reviewer 1. Those plots are from a vertex model with mechanochemical feedback, and they match well with experimental actin images.

      (5) In Figure 2j, the authors should report the relative cell pressure and line tension for all four ROIs. The data is only shown for the wild-type cells and for mutants in clusters, even though the figure caption states that the data is presented for all four ROIs. It would also be useful to report the cell tension at the interface between the mutant cells and wild-type cells since this is the key parameter for the vertex model simulations.

      We agree and have updated the graph [Figure 2j].

      (6) The tangential motion of cells around oncogenic clusters only shows up towards the end of Supplementary Video 3. It is unclear whether this is a transient effect or whether this tangential motion would persist for a longer time.

      We thank the reviewer for raising this point. In our experiments, tangential cell motion in the wild-type population along the boundary of the oncogenic cluster consistently emerges as the oncogenic cluster becomes compacted. We have plotted the tangential velocity of interfacial wild-type cells over time (Supplementary Fig. 6b), and show that such motion persists at the cluster–wild-type interface until the end of the time-lapse recordings in all cases.

      (7) It is very awkward that the authors are representing an integral of the tangential velocity over different loops in Figures 3c and 4i. Thus, it is very hard to separate how much of the increase in the integrated velocity is due to larger loops and how much is due to changes in the average tangential velocity. Since different loops have different perimeters, it would have been better to report the average tangential velocity by dividing the integrated tangential velocity by the perimeter length of each loop. In the methods, the authors state that the concentric circles go from the center to a point twice the radius of the mutant cluster, but this is not consistent with the image in Figure 3c, where the concentric circles seem to go only to the boundary of the mutant cluster.

      We thank the reviewer for raising the point regarding the dependence of the loop-integrated tangential velocity on the perimeter length. While the circulation (loop-integrated tangential velocity) indeed scales with loop size, it increases with radius only if tangential velocity components are directionally coherent along the loop.

      In our data, concentric-loop analysis centered on mutant clusters reveals a systematic increase in tangential motion with radius, with the largest values occurring at the outermost loops corresponding to the cluster–tissue interface. In contrast, applying the identical analysis to randomly selected wild-type regions does not yield any monotonic increase with radius, despite the increasing perimeter of the loops, and instead shows fluctuations around zero. This control demonstrates that the observed increase around mutant clusters is not a trivial geometric consequence of larger loop size but reflects the emergence of coherent tangential motion specifically at the mutant cluster boundary.

      To further address the reviewer’s concern, we additionally computed the mean tangential velocity by normalizing the loop-integrated tangential velocity by the loop perimeter. As shown in Supplementary Fig. 6a, this normalization preserves the same qualitative trend: tangential motion peaks near the periphery of mutant clusters, whereas no such trend is observed in wild-type regions. We therefore conclude that both metrics capture the same physical phenomenon: enhanced tangential cell motion localized to the mutant cluster boundary, consistent with the behavior observed in the time-lapse videos.
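
      The two metrics discussed above, the loop-integrated tangential velocity and its perimeter-normalized mean, can be sketched as follows. This is a minimal illustration, not the analysis pipeline used in the manuscript: the callable `velocity_at` stands in for any interpolated velocity field and is hypothetical, as are the function name and the rigid-rotation test field.

```python
import numpy as np

def loop_tangential_velocity(velocity_at, center, radius, n_points=360):
    """Circulation and perimeter-normalized mean tangential velocity on a circle.

    velocity_at: hypothetical callable mapping (x, y) arrays to (vx, vy) arrays.
    """
    theta = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
    x = center[0] + radius * np.cos(theta)
    y = center[1] + radius * np.sin(theta)
    vx, vy = velocity_at(x, y)
    # Project the velocity onto the counter-clockwise unit tangent.
    v_tangential = -vx * np.sin(theta) + vy * np.cos(theta)
    perimeter = 2.0 * np.pi * radius
    circulation = np.sum(v_tangential) * perimeter / n_points
    return circulation, circulation / perimeter

def rigid_rotation(x, y, omega=0.5):
    """Test field v = omega * (-y, x), whose tangential speed is omega * r."""
    return -omega * y, omega * x

circulation, mean_vt = loop_tangential_velocity(rigid_rotation, (0.0, 0.0), 2.0)
```

      For a coherent rotation the circulation grows with loop size while the normalized value tracks the local tangential speed; for incoherent motion both fluctuate around zero regardless of perimeter, which is the distinction the control analysis relies on.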

      Author response image 2.

      From simulation data

      (8) The authors should comment on how jamming and unjamming are related to shape indices because some readers may not be familiar with them.

      We have added this explanation to the text of Results section 2.

      (9) In the captions of Figure 3, the authors state that the bronchial epithelium gets kinetically arrested. This is not evident from the data in Figure 3d, where the velocity magnitude drops just a little bit for the bronchial epithelium, and it remains much higher compared to the mammary epithelium at long times.

      We agree with this comment: describing Beas2b cells as "kinetically arrested" is misleading, since their velocity remains much higher than that of the mammary epithelium even after the initial drop. We have updated the text in the caption accordingly.

      (10) It is unclear why the authors have used the segregation index for analyzing experiments and the demixing parameter for analyzing simulations. Both parameters are trying to quantify the same thing, so it would have been better to use the same quantity for both experiments and simulations to enable easier comparison.

      We agree that using the same quantity for both experiments and simulation would enable easier comparison. Thus, we have replaced the demixing parameter with segregation index in Figure 4. 

      (11) It is unclear what experimental data were used for shape indices in Figure 4c. Was it the data from Mcf10a or Beas2b? It is also unclear which ROIs were used because different ROIs have very different shape indices in experiments, according to Figure 2e,f.

      We have used the experimental ∆(𝑆ℎ𝑎𝑝𝑒 𝑖𝑛𝑑𝑒𝑥) = 0.75, which is a rough estimate of the difference between the shape indices for ROI 2 (interface) and those for ROI 1, ROI 3 and ROI 4 (away from the interface) from Fig. 2e for Mcf10a.

      (12) The authors find that the differences in shape indices are non-zero even for Lambda=0 for some threshold pressure parameters Pi_0 in Figure 4c. This should not happen because all the cells are identical in that case. This suggests that either there are not enough statistics from the simulations or that something is wrong with the simulations. How is this simulation data obtained? Is it from a single simulation, or is this averaged over a certain number of simulations? Authors should perform multiple simulations and report both the mean values and the standard deviation.

      We have addressed this in the response under main comments (1) and (2) from Reviewer 2.

      (13) It is unclear how the cell extrusion was simulated in the vertex model.

      Extrusion probability calculation: Simulations with just a single mutant cell were run for a range of differential interfacial line tension values (Λ = 0, 0.1, 0.4, 0.8, 1.2, 1.6) with shape-tension coupling. Each simulation was run until the area of the mutant cell fell below a threshold area of 0.1, after which we consider the mutant cell to be extruded. Nine different random initial seeds were run and analysed. Each seed gives a binary result: either extruded or not. These outcomes were used to calculate the extrusion probability. We have added this section to the Appendix.
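
      The calculation described above can be sketched as follows. This is a hedged illustration: it assumes that each seed is summarized by the final area of the mutant cell, and the function name and the example areas are hypothetical rather than simulation output.

```python
def extrusion_probability(final_areas, area_threshold=0.1):
    """Fraction of seeds in which the mutant cell area fell below the threshold."""
    extruded = [area < area_threshold for area in final_areas]  # binary outcome per seed
    return sum(extruded) / len(extruded)

# Hypothetical final mutant-cell areas from 9 seeds at one value of Lambda.
final_areas = [0.05, 0.92, 0.08, 0.03, 1.10, 0.09, 0.07, 0.85, 0.02]
probability = extrusion_probability(final_areas)
```

      With only nine seeds per value of Λ, the estimated probability is resolved in steps of 1/9, so small differences between neighbouring Λ values should be interpreted with care.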

      (14) The authors claim that HRas^V12 clusters in bronchial epithelium grew on top of one another, but it is not clear how this can be observed in Figure 2b or in any other Figure.

      We thank the reviewer for raising this point. Our original statement that cells were growing on top of each other was based on observations from the Z-stack images, which allowed us to resolve cell positions along the apico–basal axis. However, since these Z-stack data are not included in the current manuscript, we agree that this claim cannot be directly supported by the figures shown. We have therefore removed this statement from the text and restricted our conclusions to what is directly supported by the presented data.

      (15) In the main text, the authors state that bronchial epithelial cells exhibited higher F-actin intensities compared to mammary bronchial cells, but this difference is not statistically significant according to Figure 5e.

      We agree with the reviewer and have changed the text accordingly: although the F-actin intensities appeared visually higher in the bronchial epithelium, the difference was not statistically significant.

      (16) The definition of eccentricity is incorrect in the text. The authors state that the eccentricity is quantified as the ratio of the length of the minor axis to the major axis of an ellipse. According to this definition, the eccentricity would be 1 for a circle and not 0.

      We have updated the definition of eccentricity in the text to the correct one, including the correct equation.
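
      For reference, the standard geometric definition of eccentricity can be sketched as follows. The corrected equation from the manuscript is not reproduced in this response, so whether it takes exactly this form is an assumption; the function name is hypothetical.

```python
import math

def eccentricity(major_axis, minor_axis):
    """Eccentricity of an ellipse: 0 for a circle, approaching 1 when elongated."""
    if minor_axis > major_axis:
        # Accept the axes in either order.
        major_axis, minor_axis = minor_axis, major_axis
    return math.sqrt(1.0 - (minor_axis / major_axis) ** 2)

circle = eccentricity(2.0, 2.0)
elongated = eccentricity(5.0, 3.0)
```

      In contrast to the plain ratio of minor to major axis, this quantity is 0 for a circle, which addresses the inconsistency the reviewer pointed out.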

      (17) It is unclear whether the active force F_act is used in the vertex model simulations. The active force is defined, but then its value is never specified. Note that the motility force is also an active force, so it is unclear why the motility and active forces were separated.

      In our model, the line tension force arising from the shape-tension coupling is the active force. We agree that the motility force is also an active force. However, in the absence of any directional movement, as is the case for the homeostatic tissues discussed here, we have neglected the motility force in the model presented here.

      (18) The authors use inconsistent naming for different types of epithelia throughout the manuscript. Mcf10a cells are referred to as either mammary epithelium or breast epithelium, and Beas2b cells are referred to as either lung epithelium or bronchial epithelium. Because of the very broad spectrum of journal readers, it may not be obvious to all readers that different names refer to the same cell types.

      We have updated the text to keep the naming consistent throughout.

      (19) Many references to individual figure panels in the main text are incorrect. The authors should carefully check all the references to figures.

      We apologize for these errors. We have updated the incorrect references after carefully reviewing the entire manuscript.

      (20) In Figure 5, panel b is incorrectly labeled as d.

      We have corrected this.

    1. eLife Assessment

      This fundamental work substantially advances our understanding of a major research question: whether collagen can be directly imaged with MRI. The evidence supporting the conclusion is compelling, with methods, data, and analyses that are more rigorous than those currently considered state-of-the-art. The work will be of high interest to MR physicists and clinicians, as collagen is the most abundant protein in the human body and plays an essential role in health.

    2. Reviewer #1 (Public review):

      Summary:

      The aim of this work is to directly image collagen in tissue using a new MRI method with positive contrast. The work presents a new MRI method that allows very short, powerful radio frequency (RF) pulses and very short switching times between transmission and reception of radio frequency signals.

      Strengths:

      The experiments with and without removal of 1H hydrogen, which is not firmly bound to collagen, on tissue samples from tendons and bones are very well suited to prove the detection of direct hydrogen signals from collagen. The new method has great potential value in medicine, as it allows for better investigation of ageing processes and many degenerative diseases in which functional tissue is replaced by connective tissue (collagen).

      Comments on revisions:

      All points of criticism in the reviews were answered very well and led to further improvement of the article.

    3. Reviewer #2 (Public review):

      Summary:

      This work presents direct magnetic resonance imaging (MRI) of collagen, which is not possible with conventional MRI or other tomographic imaging modalities.

      Strengths:

      The experimental work is impressive, and the presentation of results is clear and convincing.

    4. Reviewer #3 (Public review):

      The paper is well written and well presented. The topic is important, and its significance is explained succinctly and accurately. I am only capable of reviewing the clinical aspects of this work which is very largely technical in nature. Several clinical points are worth considering:

      (1) Tendons typically display large magic angle effects as a result of their highly ordered collagen structure (cortical bone much less so) and so it would have been of interest to know what orientation the tendons had to B<sub>0</sub> (in vitro and in vivo). This could affect the signal level at the longer echo time and thus the signal on the subtracted images.

      (2) The in vivo transverse image looks about mid-forearm where tendons are not prominent. A transverse image of the lower forearm where there is an abundance of tendons might have been preferable.

      (3) The in vivo images show the interosseous membrane as high signal on both the shorter and longer TE images. The structure contains ordered collagen with fibres at different oblique angles to the radius and ulna and thus potentially to B<sub>0</sub>. Collagen fibres may have been at an orientation towards the magic angle and this may account for the high signal on the longer TE image, and the low signal on the subtracted image.

      (4) Some of the signals attributed to muscle may be from an attachment of the muscle to aponeurosis.

      (5) There is significant collagen in subcutaneous tissues so the designation "skin" may more correctly be "skin and subcutaneous tissue".

      (6) Cortical bone is very heterogeneous with boundaries between hard bone and soft tissue with significant susceptibility differences between the two across a small distance. This might be another mechanism for ultrashort T<sub>2</sub>* tissue values in addition to the presence of collagen. The two effects might be distinguished by also including a longer TE spin echo acquisition.

      Solid cortical bone may also have an ultrashort T<sub>2</sub>* in its own right.

      (7) It may be worth noting that in disease T<sub>2</sub>* may be increased. As a result, the subtraction image may make abnormal tissue less obvious than normal tissue. Magic angle effects may also produce this appearance.

      (8) It may be worth distinguishing fibrous connective tissue (loose or dense) which may be normal or abnormal, from fibrosis which is abnormal accumulation of fibrous connective tissue in damaged tissue. Fibrosis typically has a longer T<sub>2</sub> initially and decreases its T<sub>2</sub>* over time. In places, the context suggests that fibrous connective tissue may be more appropriate than fibrosis.

      Overall, the paper appears very well constructed and describes thoughtful and important work.

      Comments on revisions:

      The responses to my criticisms are well thought out and are fine as far as I am concerned.

      I suggest in Figure 5 line 6 changing "trabecular bone" to "trabecular bone marrow".

    5. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The aim of this work is to directly image collagen in tissue using a new MRI method with positive contrast. The work presents a new MRI method that allows very short, powerful radio frequency (RF) pulses and very short switching times between transmission and reception of radio frequency signals.

      Strengths:

      The experiments with and without the removal of 1H hydrogen, which is not firmly bound to collagen, on tissue samples from tendons and bones, are very well suited to prove the detection of direct hydrogen signals from collagen. The new method has great potential value in medicine, as it allows for better investigation of ageing processes and many degenerative diseases in which functional tissue is replaced by connective tissue (collagen).

      Weaknesses:

      It is clear that, due to the relatively long time intervals between RF excitation and signal readout, standard hardware in whole-body MRI systems can only be used to examine surrounding water and not hydrogen bound to collagen molecules.

      We agree that this is a regrettable situation (see also Discussion section). We are hoping that current and future efforts of MRI manufacturers towards improved hardware will eventually enable the technique for broader application.

      Reviewer #2 (Public review):

      Summary:

      This work presents direct magnetic resonance imaging (MRI) of collagen, which is not possible with conventional MRI or other tomographic imaging modalities.

      Strengths:

      The experimental work is impressive, and the presentation of results is clear and convincing. Through a series of thoughtfully prepared experiments, I found the evidence that the images reflect direct measurements of collagen to be highly compelling.

      Due to the technical demands, direct collagen imaging is unlikely to become widespread for routine clinical work, at least not anytime soon. That said, this work is nonetheless transformative and will likely be highly significant for research and perhaps clinical trials.

      Reviewer #3 (Public review):

      The paper is well written and well presented. The topic is important, and its significance is explained succinctly and accurately. I am only capable of reviewing the clinical aspects of this work, which is very largely technical in nature. Several clinical points are worth considering:

      (1) Tendons typically display large magic angle effects as a result of their highly ordered collagen structure (cortical bone much less so), and so it would have been of interest to know what orientation the tendons had to B<sub>0</sub> (in vitro and in vivo). This could affect the signal level at the longer echo time and thus the signal on the subtracted images.

      We have added arrows in the images showing the direction of the main magnetic field. For the in vivo case, the subject lay in the superman position, with B<sub>0</sub> pointing from the hand towards the shoulder.

      (2) The in vivo transverse image looks about mid-forearm, where tendons are not prominent. A transverse image of the lower forearm, where there is an abundance of tendons, might have been preferable.

      We have added a distal view of the forearm, where more tendon structures are observed.

      (3) The in vivo images show the interosseous membrane as a high signal on both the shorter and longer TE images. The structure contains ordered collagen with fibres at different oblique angles to the radius and ulna, and thus potentially to B<sub>0</sub>. Collagen fibres may have been at an orientation towards the magic angle, and this may account for the high signal on the longer TE image and the low signal on the subtracted image.

      This is certainly an interesting take. While the magic angle effect is well established for collagen-bound water, the orientation effects on the macromolecular collagen signal are still to be investigated. Our initial experience suggests that the direct collagen signal is not as sensitive to orientation as the bound-water signal.

      Regarding the described observation for the interosseous membrane, we expect the high signal to come from collagen-bound water (yet not quite at the magic angle), which hardly decays between the two TEs because their difference is small compared to the T<sub>2</sub>* of this signal. Hence, this signal is removed in the subtraction image, and only the macromolecular collagen signal remains, which appears to be very low. Working with samples of the interosseous membrane may provide further insights into why this is the case.

      (4) Some of the signals attributed to the muscle may be from an attachment of the muscle to the aponeurosis.

      We have added the aponeurosis as a possible signal contributor in the muscle tissue.

      (5) There is significant collagen in subcutaneous tissues, so the designation "skin" may more correctly be "skin and subcutaneous tissue".

      We have updated the label accordingly.

      (6) Cortical bone is very heterogeneous, with boundaries between hard bone and soft tissue with significant susceptibility differences between the two across a small distance. This might be another mechanism for ultrashort T<sub>2</sub>* tissue values in addition to the presence of collagen. The two effects might be distinguished by also including a longer TE spin echo acquisition.

      Solid cortical bone may also have an ultrashort T<sub>2</sub>* in its own right.

      The described effect is clearly of importance for bone water but has a negligible effect on the macromolecular signal. We would like to support this with a brief, coarse estimation. T<sub>2</sub>* can be approximated by 1/T<sub>2</sub>* = 1/T<sub>2</sub> + 1/T<sub>2</sub>′, where 1/T<sub>2</sub>′ = γΔB = γΔχB<sub>0</sub> (Ref. 1).

      The susceptibility difference reported for the interface between bone and water is Δχ = 2.5 ppm (Refs. 2 and 3), which at 3T leads to T<sub>2</sub>′ ≈ 3000 μs. From our recorded FIDs, we use a T<sub>2</sub>* of 10 μs and thus obtain T<sub>2</sub> = 10.03 μs.

      As can be seen, the change in the transverse relaxation constant due to susceptibility is negligible compared to the intrinsic decay of the macromolecular collagen signal. Notably, this is not the case for the pore water signal where T<sub>2</sub>s are on the order of milliseconds (Ref. 2).
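
      The arithmetic behind this estimate can be retraced with a minimal sketch. It assumes that the quoted T<sub>2</sub>′ of about 3000 μs was obtained with the proton gyromagnetic ratio expressed as γ/2π (42.577 MHz/T); the response does not state this, and the constant and function names below are illustrative only.

```python
# Coarse estimate of how much susceptibility-induced dephasing (T2')
# contributes to the observed T2* of the macromolecular collagen signal.
# Assumption (not stated in the response): gamma is taken as gamma/(2*pi)
# for 1H, which reproduces the quoted T2' of roughly 3000 us.

GAMMA_BAR_HZ_PER_T = 42.577e6   # 1H gyromagnetic ratio / (2*pi), assumed
DELTA_CHI = 2.5e-6              # bone/water susceptibility difference (2.5 ppm)
B0_T = 3.0                      # main field strength in tesla
T2_STAR_S = 10e-6               # T2* used for the recorded FIDs (10 us)


def t2_prime(gamma, delta_chi, b0):
    """Return T2' in seconds from 1/T2' = gamma * delta_chi * B0."""
    return 1.0 / (gamma * delta_chi * b0)


def intrinsic_t2(t2_star, t2p):
    """Solve 1/T2* = 1/T2 + 1/T2' for T2 (all times in seconds)."""
    return 1.0 / (1.0 / t2_star - 1.0 / t2p)


t2p_s = t2_prime(GAMMA_BAR_HZ_PER_T, DELTA_CHI, B0_T)
t2_s = intrinsic_t2(T2_STAR_S, t2p_s)

print(f"T2' = {t2p_s * 1e6:.0f} us")
print(f"T2  = {t2_s * 1e6:.2f} us")
```

      If the angular gyromagnetic ratio (2π × 42.577 MHz/T) were used instead, T<sub>2</sub>′ would be roughly 500 μs and T<sub>2</sub> about 10.2 μs, which would still be a small correction relative to the 10 μs decay.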

      A footnote was added in the Introduction section regarding this topic.

      (7) It may be worth noting that in disease T<sub>2</sub>* may be increased. As a result, the subtraction image may make abnormal tissue less obvious than normal tissue. Magic angle effects may also produce this appearance.

      This is an important point regarding image interpretation. For this reason, it is advantageous that the original anatomical images prior to subtraction are also available, as they will show such effects. They can be used in conjunction with the collagen-specific image to provide further insights regarding tissue disease. Increased T<sub>2</sub>* of diseased tissue has so far been reported for the bound water components due to a reduction of dipolar interactions between bound water and collagen (Ref. 4). A potential related change in T<sub>2</sub> for the macromolecular collagen component itself is certainly of interest and an avenue to explore in future work.

      (8) It may be worth distinguishing fibrous connective tissue (loose or dense), which may be normal or abnormal, from fibrosis, which is an abnormal accumulation of fibrous connective tissue in damaged tissue. Fibrosis typically has a longer T<sub>2</sub> initially and decreases its T<sub>2</sub>* over time. In places, the context suggests that fibrous connective tissue may be more appropriate than fibrosis.

      We are aware of this important distinction. We therefore checked the manuscript for references to fibrosis, making sure that the meaning is as intended.

      Overall, the paper appears very well constructed and describes thoughtful and important work.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) It should be stated that various methods with very short echo times (e.g. SWIFT by Garwood et al.) have been described in the past. This work shows for the first time that direct signals from collagen can be systematically detected in tissue samples.

      We have expanded a sentence in the introduction and reference selected publications studying short-T<sub>2</sub> water signal in collagen, including SWIFT.

      (2) It should be noted that the 1H atoms bound to collagen are located at different sites (at different amino acids of the protein) of the molecule and have different frequencies, and that further signal analyses are of interest.

      We have included additional information regarding distinct resonances of proton-binding sites of collagen in the introduction. The discrete observation of such signals requires advanced NMR methodology such as magic-angle spinning and RF decoupling, which is not a suitable approach for in vivo MRI. Without such methods, the broad lineshapes overlap strongly and are rather observed as a single decaying exponential with the dipolar oscillation as we observe in the FIDs.

      (3) Is it certain that the bump at 30 microseconds comes from 'dipolar coupling'? Is the development time probably too short for chemical shift-induced interference or J-coupling effects?

      30 microseconds is an extremely short interval to accumulate phase and requires large resonance offsets to observe significant changes. To investigate the nature of the bump, we also collected data on a Bruker 7T NMR spectrometer (see Author response image 1). Overall, the same signal characteristics are observed as at 3T. In particular, the position of the bump is the same, excluding chemical shift as a source. However, with the higher field strength, chemical shift becomes significant for the signal phase, as observed by the change in the phase behavior at 50 microseconds, when the collagen component has decayed.

      While J-coupling is independent of field strength, the typical ranges are single-digit to tens of Hertz. In contrast, dipolar couplings are on the order of thousands of Hertz, which coincides with the values extracted from our signal model.

      To clarify this point, we extended the respective sentence in the Results section.
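
      The orders of magnitude in this argument can be illustrated with a small sketch that computes the phase accumulated over 30 microseconds for three representative frequencies. All three frequencies are assumptions chosen for illustration (a 10 Hz J-coupling, a 3.5 ppm chemical-shift offset at 3T, and a 5 kHz dipolar coupling); none of them is a value reported in the study.

```python
# Phase accumulated by a frequency offset f after time t: phi = 360 * f * t (degrees).
# The frequencies below are illustrative assumptions, not measured values.

EVOLUTION_TIME_S = 30e-6          # time of the observed bump (30 us)
GAMMA_BAR_HZ_PER_T = 42.577e6     # 1H gyromagnetic ratio / (2*pi)
B0_T = 3.0                        # field strength in tesla

offsets_hz = {
    "j_coupling": 10.0,                                  # assumed 10 Hz
    "chemical_shift": 3.5e-6 * GAMMA_BAR_HZ_PER_T * B0_T,  # assumed 3.5 ppm offset
    "dipolar": 5000.0,                                   # assumed 5 kHz
}


def phase_deg(freq_hz, time_s):
    """Return the accumulated phase in degrees for a given offset and time."""
    return 360.0 * freq_hz * time_s


phases = {name: phase_deg(f, EVOLUTION_TIME_S) for name, f in offsets_hz.items()}

for name, phi in phases.items():
    print(f"{name}: {offsets_hz[name]:.0f} Hz -> {phi:.2f} degrees")
```

      Under these assumptions, only the kilohertz-range coupling accumulates a phase large enough to shape the signal within 30 microseconds, while the J-coupling contribution stays well below one degree.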

      Author response image 1.

      (4) It should be noted that short RF pulses have a relatively high energy content, and the authors should state whether the examination places any particular stress on patients (SAR, nerve stimulation?).

      SAR is an important issue in ZTE MRI. Since imaging bandwidths are large and excitation is performed with the imaging gradient being on, broadband pulses are necessary. Hence, significant RF deposition occurs and in vivo the flip angle can often not be optimized for the maximum signal, but will be limited by the SAR limit. We have added an explanation in the Discussion section.

      Peripheral nerve stimulation is generated by rapid switching of strong gradients. However, ZTE sequences are usually operated without switching gradients on and off, but with only minor adjustments of the gradient direction between TR intervals. Therefore, PNS is not a relevant issue.

      (5) In the Results section, Part B, 'substantial signal intensity' should be written instead of 'substantial image intensity'.

      We have changed this as suggested.

      References

      (1) Chavhan GB, Babyn PS, Thomas B, Shroff MM, Haacke EM. Principles, techniques, and applications of T2*-based MR imaging and its special applications. Radiographics. 2009 Sep-Oct;29(5):1433-49. doi: 10.1148/rg.295095034. PMID: 19755604; PMCID: PMC2799958.

      (2) Seifert AC, Wehrli SL, Wehrli FW. Bi-component T<sub>2</sub>* analysis of bound and pore bone water fractions fails at high field strengths. NMR Biomed. 2015;28:861–872. doi: 10.1002/nbm.3305.

      (3) Hopkins JA, Wehrli FW. Magnetic susceptibility measurement of insoluble solids by NMR: magnetic susceptibility of bone. Magn Reson Med. 1997 Apr;37(4):494-500. doi: 10.1002/mrm.1910370404. PMID: 9094070.

      (4) Loegering IF, Denning SC, Johnson KM, Liu F, Lee KS, Thelen DG. Ultrashort echo time (UTE) imaging reveals a shift in bound water that is sensitive to sub-clinical tendinopathy in older adults. Skeletal Radiol. 2021 Jan;50(1):107-113. doi: 10.1007/s00256-020-03538-1. Epub 2020 Jul 8. PMID: 32642791; PMCID: PMC7677198.

    1. eLife Assessment

      This is a useful study that seeks to elucidate the molecular mechanisms underlying spinal motor circuit assembly. The authors demonstrate that loss of Onecut transcription factors in spinal motor neurons affects the size and spatial distribution of pre-motor interneurons. However, the study in its current form is incomplete: the data and analyses do not fully support the main conclusion that Onecut acts through Neurotrophin-3 to regulate interneuron development in a non-cell autonomous manner. The work will be of broad interest to cell and developmental biologists.

    2. Reviewer #1 (Public review):

      Summary:

      In this study, Angla et al investigate the basis of observations made from previous studies where loss of Onecut (OC) transcription factors leads to changes in spinal interneuron populations that do not themselves normally express OC. The authors hypothesize that OC expression in spinal motor neurons has non-cell-autonomous effects on pre-motor interneuron (V1, V2a/b/c) population size and distribution. By knocking out OC in the motor neuron lineage (i.e., downstream of Olig2, a motor neuron progenitor marker gene), they indeed show that motor neuron-specific loss of OC expression decreases V2c interneuron number and alters the spatial distribution of V1, V2a, V2b, and V2c populations. Using bulk RNA-sequencing of WT and OC conditional knockout (cKO) motor neurons, the authors identify that the neurotrophic factor Ntf3 is downregulated by OC expression. They subsequently hypothesize that the non-cell-autonomous effects observed by loss of OC expression in motor neurons can be explained by de-repression of Ntf3. To test this, the authors conditionally knock out Ntf3 downstream of Olig2 and show that this leads to increased interneuron numbers and alters their spatial distribution, ultimately leading to dysregulation of spinal motor circuits and motor activity.

      Strengths:

      The authors use sophisticated genetic tools to precisely remove OC and Ntf3 expression in a lineage-specific manner and comprehensively assess the downstream effects across brachial, thoracic, and lumbar levels of the spinal cord, as well as at two developmental timepoints, E12.5 and E14.5.

      Weaknesses:

      There are two main concerns that are not fully addressed:

      (1) Based on the effects observed with OC vs. Ntf3 cKO, it is unclear whether OC is indeed exerting its non-cell-autonomous effects via Ntf3. Knocking out both Ntf3 and OC and comparing the effects to those seen with just OC cKO alone could provide more insight on this point. Also, a quantitative summary of the effects of Ntf3 overexpression in motor neurons in the chick is lacking.

      (2) How the authors assess changes in the spatial distribution of interneurons is unclear. In Figures 2 and 4, the control distributions (despite reporting the same populations in the same regions) look different, suggesting large sample-to-sample variance in distribution. Although the authors report that several sections in each level were taken from at least three animals for each condition, it's unclear how variance within WT or cKO sections was accounted for in the final statistical evaluation. It seems at a glance that a comparison between control samples in Figure 2 and Figure 4 could report statistically significant differences, which would be problematic. A more rigorous report of sample-to-sample variance and a more in-depth explanation of the statistical methods are needed.

    3. Reviewer #2 (Public review):

      The study by Angla et al. proposes a model in which NT-3 produced by motor neurons regulates interneuron numbers and distribution in a non-cell autonomous manner. The authors demonstrate that ablation of motor neurons (MNs) and global and conditional deletion of OC transcription factors lead to changes in interneuron distribution. They identify that NT3 is upregulated after MN-specific OC deletion in RNA-seq experiments and show that olig2-cre mediated NT3 deletion leads to increased ventral interneuron numbers, altered distribution, and defects in locomotor behavior. The authors conclude that MN-derived NT3, under OC control, regulates interneuron development. While this is an intriguing hypothesis, additional experiments are needed to support it and strengthen the link between the different experiments described here.

      (1) The study primarily quantifies interneuron numbers and distribution at different levels of the spinal cord and under different genetic manipulations. Experimental details are lacking, defining how many sections were analyzed (several are noted in the methods) and how the rostrocaudal levels of the spinal cord were precisely aligned. In different figures, the values and distributions shown for controls vary quite a lot. For example, in Figure 2B vs Figure 4B, the number of FoxP2+ V1 neurons at brachial levels is ~350 vs 125. Similarly, the control distributions in 2I and 4I are quite different. This makes it challenging to determine whether the conclusions regarding the impact of each genetic manipulation on interneuron numbers and distribution are valid.

      (2) The relationship between OC and NT3 deletion data is not entirely clear. Both deletions presumably lead to changes in interneuron distribution, but is there any reverse relationship between the two that relates to relative changes in NT3 levels? The authors do not directly compare NT3 and OC KO IN distributions. Similarly, one might expect a decrease in interneuron numbers in OC mutants, which is only reported for V2c neurons. However, the image presented in Figure 2G shows an equal number of V2c INs in control and mutant.

      (3) It is not clear that the behavioral phenotypes seen in the olig2-cre mediated deletion of NT3 can be attributed to changes in interneuron development. How about a role of NT3 in oligodendrocytes? There is a big gap between the embryonic changes shown here and behavior, with no in-between circuit-level changes in locomotor circuits shown. A more restricted manipulation would be deleting TrkC from specific interneuron populations. Related to this, although TrkC is shown to be broadly expressed in ventral interneurons, it is not shown specifically to colocalize with any of the interneuron markers. The authors should validate that the receptor is expressed in the subsets that they are investigating.

      (4) The rationale for following up on NT3 seems to be the chick electroporation experiments; however, no changes in distribution are shown in those experiments, and only a very minor decrease in Chx10 interneurons. Shouldn't NT3 overexpression lead to substantial decreases in IN numbers according to the authors' model? The "data not shown", which presumably refers to distribution, would be important to show here, to further support this rationale.

      (5) The idea that NT3 downregulation causes an increase in IN numbers is not intuitive. Also, consider the DTA experiments in Figure 1, which show that MN ablation leads to a decrease in several IN subtypes and no changes in V2a neurons. It would be helpful for the reader if the authors could synthesize their results in the discussion and reconcile their experimental findings.

    4. Reviewer #3 (Public review):

      This manuscript aims to investigate cell extrinsic mechanisms that regulate the differentiation and distribution of interneuron types in the spinal cord. The authors demonstrate that the loss of motor neurons leads to changes in the number and distribution of different interneuron types, specifically V0v, V1, and V2b (but not V2a). The authors then hypothesize that this phenotype may be controlled by the action of Onecut (OC) transcription factors in motor neurons. Conditional knockout of OC1 + OC2 in motor neurons using Olig2-Cre, however, does not lead to significant changes in the numbers of V1, V2a, and V2b interneurons, although there is a change in their spatial distribution. While the authors do not check V0v neurons in OC mutants, they do check V2c, which show a reduction in number and change in distribution. Why the same neurons are not checked across experiments is unclear. The authors then analyze existing RNA-seq data to identify factors that could be mediating the effects of the OC factors in motor neurons. They identify Ntf3 as a candidate and confirm that it is upregulated in OC mutants. Conditional loss of function of Ntf3 (Olig2-Cre) leads to increases in V1, V2a, and V2b (but not V2c) interneurons and changes in the distribution of all four interneuron types. Finally, the authors demonstrate that these Ntf3 conditional mutants have major defects in motor function.

      The conclusions of this manuscript are not well supported by the data for the reasons listed below, making it difficult to assess the impact of this work on the field.

      (1) The manuscript relies heavily on quantifying numbers and the spatial distribution of interneuron populations. However, these do not seem to be consistent in control animals across experiments, making it difficult to interpret any changes observed in genetic manipulations. Specifically, in Figures 2 and 4, the same markers are being used to quantify V1, V2a, V2b, and V2c interneurons in controls vs. OC (Figure 2) or Ntf3 (Figure 4) conditional knockouts, but the numbers of neurons and their distribution in control animals are variable between these two figures. For example, there seems to be a mean of >300 V1 neurons in E12.5 brachial sections of Fig. 2 controls, but this number is <150 in Fig. 4 controls. The cell distribution scoring is similarly variable between these controls without any explanation. The same is true for E14.5 controls used in Figure S1 vs. Figure S3.

      (2) Neurotrophic factors generally promote neuronal survival. However, in this study, the loss of Ntf3 leads to increased numbers of interneurons. This finding is in disagreement with previous observations in slice cultures of spinal cords, as stated in the discussion. This discrepancy makes it even more important that the cell counts reported in the figures discussed above are robust.

      (3) The claim that phenotypes are non-cell autonomously driven by motor neurons is not well supported. In Olig2-Cre conditional knockouts of Onecut and Ntf3, there is no confirmation that the loss of these factors is specific to motor neurons. Therefore, it cannot be ruled out that other cell populations may be mediating the phenotypes.

      (4) The claim that interneuron development is regulated by OC control of Ntf3 expression in motor neurons is not well supported. The authors show that loss of OC1/2 leads to an increase in Ntf3 expression in motor neurons. If this pathway were controlling interneurons, loss of OC function and overexpression of Ntf3 would have the same phenotype, which is not the case. Additionally, it would also be expected that loss of OC function and loss of Ntf3 function would have inverse phenotypes, which is also not the case. The phenotypes from OC loss of function and Ntf3 loss of function seem distinct from one another. The authors state that too little and too much Ntf3 are both bad for interneuron development, but there is no data to support their claim that OC1/2 mutants have altered interneuron development because of higher Ntf3 expression.

      (5) It is not clear that interneurons being studied express the Ntf3 receptor TrkC, which makes it difficult to assess whether changes in Ntf3 signaling are directly responsible for the phenotype.

      (6) While the behavioral phenotypes are consistent with Ntf3 playing a role in motor circuits, there is no evidence to suggest that Ntf3's influence on premotor interneurons being studied is driving or contributing to this phenotype, as discussed by the authors.

    5. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      (1) Based on the effects observed with OC vs. Ntf3 cKO, it is unclear whether OC is indeed exerting its non-cell-autonomous effects via Ntf3. Knocking out both Ntf3 and OC and comparing the effects to those seen with just OC cKO alone could provide more insight on this point.

      In this study, we did not intend to demonstrate that Onecut transcription factors exert their non-cell autonomous action on spinal interneuron development by regulating Ntf3 expression, and we do not state in the manuscript that this is the case. We only show that Onecut factors and Ntf3, the expression of which they regulate, contribute to the non-cell autonomous regulation of spinal interneuron development by the motor neurons. We are convinced that Onecut factors could regulate multiple independent factors and pathways involved in extrinsic regulation of interneuron development, as supported by the altered expression of multiple secreted factors and membrane proteins in motor neurons detected in the reported RNA-sequencing experiment (this manuscript and [1]). This possibly also includes, as demonstrated in cell culture for multiple homeoproteins including human Onecut factors [2], the intercellular transfer of the Onecut homeoproteins during spinal cord development, a process that we are currently investigating. Knocking out both OC and Ntf3 in the motor neurons, beyond being technically extremely challenging (1/64 probability to obtain triple-mutant embryos), would not allow us to address this question, as it would simply result in the addition of two different defects.

      Also, a quantitative summary of the effects of Ntf3 overexpression in motor neurons in the chick is lacking.

      A quantitative summary of the effects of Ntf3 overexpression in the chicken embryonic spinal cord is provided in Figure S2.

      (2) How the authors assess changes in the spatial distribution of interneurons is unclear. In Figures 2 and 4, the control distributions (despite reporting the same populations in the same regions) look different, suggesting large sample-to-sample variance in distribution. Although the authors report that several sections in each level were taken from at least three animals for each condition, it's unclear how variance within WT or cKO sections was accounted for in the final statistical evaluation. It seems at a glance that a comparison between control samples in Figure 2 and Figure 4 could report statistically significant differences, which would be problematic. A more rigorous report of sample-to-sample variance and a more in-depth explanation of the statistical methods are needed.

      The experimental procedure to analyze the spatial distribution of spinal interneurons at different stages of development is described in detail in the “Statistical analyses” paragraph of the Materials and Methods section of the manuscript, and has been repeatedly used by ourselves [3,4] and by others (see for example [5-7]) to conduct similar analyses.

      We also noticed that the distribution of the different analyzed interneuron populations in the control embryos showed some differences between the cOc1/Oc2<sup>-/-</sup> and the cNtf3<sup>-/-</sup> lines. Several parameters can account for this observation. First, this study was conducted over a period of 15 years, with different investigators each contributing to different steps of the analysis. Second, the genetic background of these two lines is not identical, impacting both the duration of the gestation (hence, the embryonic stage of the performed analyses, even if the embryos were collected on the same gestation day) and possibly the distribution of some interneuron populations. Third, because of changes in the availability of the primary antibodies used to label the interneuron populations of interest, the same antibodies were not used throughout the study, as stated in the Materials and Methods section, although the same antibody was used by the same investigator to label the same interneuron population in each mouse line at each developmental stage.

      A detailed description of the number of sections and embryos included in each analysis as well as the whole statistical workflow that was used for the distribution analyses, which takes into account variance within control or mutant samples, will be provided in the revised version of the manuscript.

      Reviewer #2 (Public review):

      (1) The study primarily quantifies interneuron numbers and distribution at different levels of the spinal cord and under different genetic manipulations. Experimental details are lacking, defining how many sections were analyzed (several are noted in the methods) and how the rostrocaudal levels of the spinal cord were precisely aligned.

      A detailed description of the number of sections and embryos included in each analysis as well as the whole statistical workflow that was used for the distribution analyses will be provided in the revised version of the manuscript. The rostrocaudal levels of the spinal cord were precisely aligned using the distribution of Foxp1 in the Lateral Motor Columns (LMCs) at brachial or lumbar levels of the spinal cord [8,9], which will also be indicated in the revised version.

      In different figures, the values and distributions shown for controls vary quite a lot. For example, in Figure 2B vs Figure 4B, the number of FoxP2+ V1 neurons at brachial levels is ~350 vs 125. Similarly, the control distributions in 2I and 4I are quite different. This makes it challenging to determine whether the conclusions regarding the impact of each genetic manipulation on interneuron numbers and distribution are valid.

      Multiple factors may explain these observations. First, this study spans a 15-year period, with different researchers contributing to various stages of the analysis. Second, the genetic backgrounds of the two mouse lines are not identical, affecting both gestation length (thus influencing the embryonic stage at which analyses were performed, even when embryos were collected on the same gestational day) and potentially the distribution of certain interneuron populations. Third, due to changes in the availability of primary antibodies used to label the targeted interneuron populations, the same antibodies were not consistently employed throughout the study, as noted in the Materials and Methods section, though each investigator used the same antibody for a given interneuron population and developmental stage within each mouse line.

      (2) The relationship between OC and NT3 deletion data is not entirely clear. Both deletions presumably lead to changes in interneuron distribution, but is there any reverse relationship between the two that relates to relative changes in NT3 levels? The authors do not directly compare NT3 and OC KO IN distributions. Similarly, one might expect a decrease in interneuron numbers in OC mutants, which is only reported for V2c neurons. However, the image presented in Figure 2G shows an equal number of V2c INs in control and mutant.

      This study was not designed to demonstrate that Onecut transcription factors influence spinal interneuron development in a non-cell-autonomous manner through Ntf3 regulation, nor do we claim this in the manuscript. Instead, we show that Onecut factors and Ntf3, whose expression they control, contribute to the non-cell-autonomous regulation of spinal interneuron development by motor neurons. We believe Onecut factors may regulate multiple independent factors and pathways involved in the extrinsic control of interneuron development. For instance, as noted earlier [2], we observed intercellular transfer of Onecut homeoproteins during spinal cord development, suggesting alternative mechanisms for non-cell-autonomous regulation.

      The two mouse lines studied here correspond, on the one hand, to a combination of OC inactivation and increased Ntf3 expression and, on the other hand, to Ntf3 inactivation. Therefore, a reverse relationship between the changes in interneuron distribution is not expected. Furthermore, gain-of-function and loss-of-function experiments in mouse models frequently generate phenotypes that are not inverse to each other [10-13].

      (3) It is not clear that the behavioral phenotypes seen in the olig2-cre mediated deletion of NT3 can be attributed to changes in interneuron development. How about a role of NT3 in oligodendrocytes? There is a big gap between the embryonic changes shown here and behavior, with no in-between circuit-level changes in locomotor circuits shown.

      We agree: the motor behavior changes that we recorded in Ntf3 conditional mutant mice are, as stated, “consistent with the hypothesis that Ntf3 produced by MNs is required to generate locomotor circuits with properly coordinated activity” but do not demonstrate a direct causal relationship. However, investigating the intrinsic activity of the spinal locomotor circuits independently from, for example, oligodendrocyte contribution may prove to be extremely challenging and was beyond the scope of this study. In addition, to the best of our knowledge, Ntf3 has not been shown to be expressed in healthy oligodendrocytes in vivo, and TrkC has not been reported to be expressed by these cells in the same conditions.

      A more restricted manipulation would be deleting TrkC from specific interneuron populations. Related to this, although TrkC is shown to be broadly expressed in ventral interneurons, it is not shown specifically to colocalize with any of the interneuron markers. The authors should validate that the receptor is expressed in the subsets that they are investigating.

      We agree, investigating the consequences of inactivating the TrkC receptor in specific interneuron populations would be extremely informative. However, this experiment is also very challenging to perform, as most of the driver lines available to target spinal interneuron populations additionally target multiple neuronal populations outside of the spinal cord that are also involved in the control of movements and could therefore induce confounding effects on motor behavior analyses [14-20].

      We thank the reviewer for suggesting that we investigate in more detail the interneuron populations that display TrkC receptors. This will be included in the revised version of the manuscript.

      (4) The rationale for following up on NT3 seems to be the chick electroporation experiments; however, no changes in distribution are shown in those experiments, and only a very minor decrease in Chx10 interneurons. Shouldn't NT3 overexpression lead to substantial decreases in IN numbers according to the authors' model? The "data not shown", which presumably refers to distribution, would be important to show here, to further support this rationale.

      Chicken spinal cord electroporation only allows spinal cord development to be studied within a limited time window, given the high mortality rate observed after longer incubation. At the stage at which we collected the electroporated embryos for analysis, interneuron migration had barely been initiated, and distribution could not yet be studied. Consistently, we are not aware of any report of interneuron distribution analysis in electroporated chicken embryonic spinal cord, in contrast to mouse embryos [3-7].

      (5) The idea that NT3 downregulation causes an increase in IN numbers is not intuitive. Also, consider the DTA experiments in Figure 1, which show that MN ablation leads to a decrease in several IN subtypes and no changes in V2a neurons. It would be helpful for the reader if the authors could synthesize their results in the discussion and reconcile their experimental findings.

      We agree. This will be included in the revised version of the manuscript.

      Reviewer #3 (Public review):

      (1) The manuscript relies heavily on quantifying numbers and the spatial distribution of interneuron populations. However, these do not seem to be consistent in control animals across experiments, making it difficult to interpret any changes observed in genetic manipulations. Specifically, in Figures 2 and 4, the same markers are being used to quantify V1, V2a, V2b, and V2c interneurons in controls vs. OC (Figure 2) or Ntf3 (Figure 4) conditional knockouts, but the numbers of neurons and their distribution in control animals are variable between these two figures. For example, there seems to be a mean of >300 V1 neurons in E12.5 brachial sections of Fig. 2 controls, but this number is <150 in Fig. 4 controls. The cell distribution scoring is similarly variable between these controls without any explanation. The same is true for E14.5 controls used in Figure S1 vs. Figure S3.

      We indeed observed variations in the quantifications and distributions of the analyzed interneuron populations in control embryos between the cOc1/Oc2<sup>⁻/⁻</sup> and cNtf3<sup>⁻/⁻</sup> lines. Several factors may explain this discrepancy. First, the study was carried out over 15 years, with different investigators contributing to distinct stages of the analysis—meaning interneuron distribution was not assessed by the same researchers in both lines. Second, the genetic backgrounds of the two lines differ, affecting gestation length (and thus the embryonic stage at analysis, even when embryos were collected on the same gestational day) as well as potentially altering the distribution of certain interneuron populations. Third, changes in the availability of primary antibodies targeting the interneuron populations of interest led to inconsistencies in antibody use across the study, as detailed in the Materials and Methods section. However, each investigator consistently used the same antibody for a given interneuron population and developmental stage within each mouse line.

      (2) Neurotrophic factors generally promote neuronal survival. However, in this study, the loss of Ntf3 leads to increased numbers of interneurons. This finding is in disagreement with previous observations in slice cultures of spinal cords, as stated in the discussion. This discrepancy makes it even more important that the cell counts reported in the figures discussed above are robust.

      Considering that neurotrophic factors only support neuronal survival would neglect their important function in neuronal differentiation, which has been broadly demonstrated. Immunotoxic ablation of motor neurons or antiserum blockade of Ntf3 activity severely depleted inhibitory, but not excitatory, interneurons in a highly apoptosis-prone organotypic culture model of embryonic rat spinal cord slices, and this depletion was rescued by Ntf3 in the first model [21]. Opposite results were obtained in vivo by other researchers using mouse models lacking almost all MNs due to the elimination of skeletal muscles, where the number of spinal INs remained unaffected [22,23]. Combined with our results, these in vivo observations suggest that Ntf3 is involved in interneuron differentiation rather than in interneuron survival. Consistently, Ntf3 has been shown to promote neuronal differentiation [24].

      (3) The claim that phenotypes are non-cell autonomously driven by motor neurons is not well supported. In Olig2-Cre conditional knockouts of Onecut and Ntf3, there is no confirmation that the loss of these factors is specific to motor neurons. Therefore, it cannot be ruled out that other cell populations may be mediating the phenotypes.

      Combined conditional inactivation of Oc1 and Oc2 has been reported in [1]. Conditional inactivation of Ntf3 only impacts motor neurons as it is the only cell population in the ventral spinal cord wherein this factor is produced (this study and [25-27]). Furthermore, Olig2-Cre has been shown to be active in motor neurons and in V3 interneurons (see for example [10]), which, for this reason, have not been studied in the frame of this project as stated in the manuscript.

      (4) The claim that interneuron development is regulated by OC control of Ntf3 expression in motor neurons is not well supported. The authors show that loss of OC1/2 leads to an increase in Ntf3 expression in motor neurons. If this pathway were controlling interneurons, loss of OC function and overexpression of Ntf3 would have the same phenotype, which is not the case. Additionally, it would also be expected that loss of OC function and loss of Ntf3 function would have inverse phenotypes, which is also not the case. The phenotypes from OC loss of function and Ntf3 loss of function seem distinct from one another. The authors state that too little and too much Ntf3 are both bad for interneuron development, but there is no data to support their claim that OC1/2 mutants have altered interneuron development because of higher Ntf3 expression.

      This study was not aimed at proving that Onecut transcription factors mediate their non-cell-autonomous effects on spinal interneuron development through Ntf3 regulation, nor do we make this claim in the manuscript. Rather, we demonstrate that Onecut factors, and Ntf3, whose expression they control, participate in the non-cell-autonomous regulation of spinal interneuron development by motor neurons. We propose that Onecut factors likely modulate multiple independent factors and pathways involved in the extrinsic regulation of interneuron development, as evidenced by the regulation of various secreted factors and membrane proteins in motor neurons observed in our RNA-sequencing data (this study and [1]). This may also involve intercellular transfer of Onecut homeoproteins during spinal cord development, a mechanism previously shown in cell culture for several homeoproteins, including human Onecut factors [2], which we are currently exploring.

      (5) It is not clear that interneurons being studied express the Ntf3 receptor TrkC, which makes it difficult to assess whether changes in Ntf3 signaling are directly responsible for the phenotype.

      The immunofluorescence experiment in Figure 3C shows that the TrkC receptor is present in cell populations surrounding motor neurons at e12.5, a stage at which only the pre-motor interneuron populations reported in the manuscript are present. However, we thank the reviewer for suggesting that we investigate in more detail the interneuron populations that display TrkC receptors; this will be included in the revised version of the manuscript.

      (6) While the behavioral phenotypes are consistent with Ntf3 playing a role in motor circuits, there is no evidence to suggest that Ntf3's influence on premotor interneurons being studied is driving or contributing to this phenotype, as discussed by the authors.

      We acknowledge that the motor behavior changes observed in Ntf3 conditional mutant mice—as noted—are “consistent with the hypothesis that MN-derived Ntf3 is necessary for the formation of locomotor circuits with properly coordinated activity,” but they do not establish a direct causal link. However, analyzing the intrinsic activity of spinal locomotor circuits was beyond the scope of this study.

      (1) Toch, M. et al. Onecut-dependent Nkx6.2 transcription factor expression is required for proper formation and activity of spinal locomotor circuits. Sci Rep 10, 996 (2020). https://doi.org/10.1038/s41598-020-57945-4

      (2) Lee, E. J. et al. Global Analysis of Intercellular Homeodomain Protein Transfer. Cell Rep 28, 712-722 e713 (2019). https://doi.org/10.1016/j.celrep.2019.06.056

      (3) Harris, A. et al. Onecut factors and Pou2f2 regulate the distribution of V2 interneurons in the mouse developing spinal cord. Front Cell Neurosci 13 (2019). https://doi.org/10.3389/fncel.2019.00184

      (4) Kabayiza, K. U. et al. The Onecut Transcription Factors Regulate Differentiation and Distribution of Dorsal Interneurons during Spinal Cord Development. Front Mol Neurosci 10, 157 (2017). https://doi.org/10.3389/fnmol.2017.00157

      (5) Deska-Gauthier, D. et al. Embryonic temporal-spatial delineation of excitatory spinal V3 interneuron diversity. Cell Rep 43, 113635 (2024). https://doi.org/10.1016/j.celrep.2023.113635

      (6) Bikoff, J. B. et al. Spinal Inhibitory Interneuron Diversity Delineates Variant Motor Microcircuits. Cell 165, 207-219 (2016). https://doi.org/10.1016/j.cell.2016.01.027

      (7) Hayashi, M. et al. Graded Arrays of Spinal and Supraspinal V2a Interneuron Subtypes Underlie Forelimb and Hindlimb Motor Control. Neuron 97, 869-884 e865 (2018). https://doi.org/10.1016/j.neuron.2018.01.023

      (8) Rousso, D. L., Gaber, Z. B., Wellik, D., Morrisey, E. E. & Novitch, B. G. Coordinated actions of the forkhead protein Foxp1 and Hox proteins in the columnar organization of spinal motor neurons. Neuron 59, 226-240 (2008). https://doi.org/10.1016/j.neuron.2008.06.025

      (9) Roy, A. et al. Onecut transcription factors act upstream of Isl1 to regulate spinal motoneuron diversification. Development 139, 3109-3119 (2012). https://doi.org/10.1242/dev.078501

      (10) Debrulle, S. et al. Vsx1 and Chx10 paralogs sequentially secure V2 interneuron identity during spinal cord development. Cell Mol Life Sci 77, 4117-4131 (2020). https://doi.org/10.1007/s00018-019-03408-7

      (11) Brunklaus, A. et al. Brain 145, 3816-3831 (2022).

      (12) Scekic-Zahirovic, J. et al. EMBO J 35, 1077-1097 (2016).

      (13) Wong, J. C. Epilepsy Curr 25, 347-349 (2025).

      (14) Hafler, B. P., Choi, M. Y., Shivdasani, R. A. & Rowitch, D. H. Expression and function of Nkx6.3 in vertebrate hindbrain. Brain Res 1222, 42-50 (2008). https://doi.org/10.1016/j.brainres.2008.04.072

      (15) Nardelli, J., Thiesson, D., Fujiwara, Y., Tsai, F. Y. & Orkin, S. H. Expression and genetic interaction of transcription factors GATA-2 and GATA-3 during development of the mouse central nervous system. Dev Biol 210, 305-321 (1999).

      (16) Bretzner, F. & Brownstone, R. M. J Neurosci 33, 14681-14692 (2013).

      (17) Chopek, J. W., Zhang, Y. & Brownstone, R. M. J Neurophysiol 126, 1978-1990 (2021).

      (18) Miyagi, S., Kato, H. & Okuda, A. Cell Mol Life Sci 66, 3675-3684 (2009).

      (19) French, C. A. et al. Mol Psychiatry 24, 447-462 (2019).

      (20) Khouri-Farah, N., Guo, Q., Perry, T. A., Dussault, R. & Li, J. Y. H. Nat Neurosci 28, 2022-2033 (2025).

      (21) Bechade, C., Mallecourt, C., Sedel, F., Vyas, S. & Triller, A. J Neurosci 22, 8779-8784 (2002).

      (22) Grieshammer, U., Lewandoski, M., Prevette, D., Oppenheim, R. W. & Martin, G. R. Muscle-specific cell ablation conditional upon Cre-mediated DNA recombination in transgenic mice leads to massive spinal and cranial motoneuron loss. Dev Biol 197, 234-247 (1998). https://doi.org/10.1006/dbio.1997.8859

      (23) Kablar, B. & Rudnicki, M. A. Development in the absence of skeletal muscle results in the sequential ablation of motor neurons from the spinal cord to the brain. Dev Biol 208, 93-109 (1999). https://doi.org/10.1006/dbio.1998.9184

      (24) Dutton, R., Yamada, T., Turnley, A., Bartlett, P. F. & Murphy, M. Regulation of spinal motoneuron differentiation by the combined action of Sonic hedgehog and neurotrophin 3. Clin Exp Pharmacol Physiol 26, 746-748 (1999). https://doi.org/10.1046/j.1440-1681.1999.03108.x

      (25) Buck, C. R., Seburn, K. L. & Cope, T. C. Neurotrophin expression by spinal motoneurons in adult and developing rats. J Comp Neurol 416, 309-318 (2000).

      (26) Henderson, C. E. et al. Neurotrophins promote motor neuron survival and are present in embryonic limb bud. Nature 363, 266-270 (1993). https://doi.org/10.1038/363266a0

      (27) Usui, N. et al. Role of motoneuron-derived neurotrophin 3 in survival and axonal projection of sensory neurons during neural circuit formation. Development 139, 1125-1132 (2012). https://doi.org/10.1242/dev.069997

    1. eLife Assessment

      This important paper provides novel information on the function of the Drosophila ryanodine receptor (RyR) during muscle development. The authors analyze the effects of a rare human mutation that causes myopathy that affects a conserved region of the gene. They present compelling evidence that this variant affects muscle function in flies. These results suggest that Drosophila can be used as a tool for screening additional variants.

      [Editors' note: this paper was reviewed by Review Commons.]

    2. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Zmojdzian et al. provide an analysis of ryanodine receptor (RyR) expression and function in Drosophila. They also use CRISPR to engineer into flies a RyR variant of unknown significance (VUS) found in a human myopathy patient and demonstrate that it is likely a pathogenic mutation. From studies of RyR expression in embryonic and larval stages, and effects of RyR knockdown or overexpression in various muscle groups, the authors show that, in addition to its known actions in calcium-dependent excitation-contraction coupling, RyR promotes myogenesis during development.

      The key conclusions of the paper are convincing. I do not have suggestions for necessary additional experimental work, and my comments are minor. One conclusion, that RyR dysfunction may be involved in aging, is stated in multiple places, sometimes speculatively but once very forcefully. The latter is in the final paragraph of the Discussion, which states RyR "plays an instrumental anti-aging role in differentiated striated muscle". This conclusion must be tempered, as even if RyR knockdown phenotypes resemble some of those seen in aging flies, the study does not examine aged flies, and there is no mechanistic analysis that might link the two. I assume the authors would prefer to modify that sentence than initiate work with aging flies to prove the assertion. Finally, the use of CRISPR to test a VUS is excellent and suggests a good way for testing of additional RyR variants in the future.

      Significance:

      The paper is significant in that RyR is known to be a critical protein in calcium-dependent excitation-contraction coupling but its role in developmental myogenesis is poorly studied. This study demonstrates that it is expressed during, and is important for, embryonic and larval myogenesis in the fly. RyR is also understudied in this valuable model organism, even though a P element-based mutant has been available since 2000. The mechanistic basis for the functional observations is not explored here but the work is well performed and will be of interest to investigators studying muscle development (my own field) and diseases caused by RyR mutations.

    3. Reviewer #2 (Public review):

      Summary:

      This paper presents data using the Drosophila model to analyze the effects of a rare human mutation in the gene encoding the ryanodine receptor (ryr). The authors present a nice, comprehensive phylogenetic analysis that shows the Drosophila version of Ryr to be most similar to human RYR2 and that the known "hot spots" for mutations in RYR2 coincide with highly conserved regions of the Drosophila Ryr. They characterize the functional effects of ryr knockdown and overexpression on both adult heart function and larval body wall muscle. They identified embryonic ryr expression in association with actin-stained muscle precursor cells and provide beautiful stains, which clearly showed that embryonic muscle cell development was disrupted in ryr mutants. In support of these findings, KD of Calmodulin in larva (an Ryr inhibitor) phenocopied Ryr OE. They recreated a human variant of unknown function (RyR1 p.Met4881Ile ) in the conserved region of the fly gene and tested the effect on larval muscle. Their data suggested that this variant was likely deleterious as it negatively affected most muscle parameters.

      Major comments:

      (1) Fig. 1: In G there is no data for the RNAi KD situation.

      (2) Fig. 2 Authors should include Diastolic Diameters; they mention dilated cardiomyopathy but don't show the dilation. The authors should also show staining in hearts with RYR OE and RNAi. It would be nice to have some kind of quantification of disorganized myofibrils.

      (3) To evaluate and reproduce the data on the larva muscle parameters the authors should provide more details on how sarcomere length was quantified in each larva (replicates, ROI size, etc). Similarly, how were # of nuclei quantified / normalized? Importantly for these measurements, did the authors know what the contraction state of the muscles were when fixed?

      (4) Fig. 3, Are RNAi and OE in the same background? I only see one control in the graphs for the RNAi line background.

      (5) Fig. 3 How VL3 length was determined needs more detail, the Zhang ref is not adequate.

      (6) In order to be able to evaluate the data, the statistical tests used should be cited in the figure legends along with what *, ** ,*** stand for (or just provide p values).

      Significance:

      The authors nicely characterized the role of Ryr in muscle development and function and recreated a human variant of unknown function (RyR1 p.Met4881Ile) in the conserved region of the fly gene. Their data suggested that this variant was likely deleterious as it negatively affected most muscle parameters. This work supports a role for the fly model in testing potential human disease gene variants.

      Comments on Revised Version:

      The authors have very adequately addressed the points raised by all reviewers.

    4. Author response:

      General Statements

      We would like to extend our gratitude to all reviewers for their supportive feedback, which acknowledges our study as well performed, of interest to investigators studying muscle development and diseases, and supportive of a role for the fly model in testing potential human disease gene variants. We also thank the reviewers for their valuable critical comments. We carefully considered all of them, performed additional experiments, and made the suggested text amendments.

      We believe these modifications substantially improve the quality of our results and enhance the general interest of our work.

      Point-by-point description of the revisions

      Reviewer #1:

      In this manuscript, Zmojdzian et al. provide an analysis of ryanodine receptor (RyR) expression and function in Drosophila. They also use CRISPR to engineer into flies a RyR variant of unknown significance (VUS) found in a human myopathy patient and demonstrate that it is likely a pathogenic mutation. From studies of RyR expression in embryonic and larval stages, and effects of RyR knockdown or overexpression in various muscle groups, the authors show that, in addition to its known actions in calcium-dependent excitation-contraction coupling, RyR promotes myogenesis during development.

      The key conclusions of the paper are convincing. I do not have suggestions for necessary additional experimental work, and my comments are minor. One conclusion, that RyR dysfunction may be involved in aging, is stated in multiple places, sometimes speculatively but once very forcefully. The latter is in the final paragraph of the Discussion, which states RyR "plays an instrumental anti-aging role in differentiated striated muscle". This conclusion must be tempered, as even if RyR knockdown phenotypes resemble some of those seen in aging flies, the study does not examine aged flies, and there is no mechanistic analysis that might link the two. I assume the authors would prefer to modify that sentence than initiate work with aging flies to prove the assertion.

      We thank the Reviewer for this comment and have removed the hypothetical anti-aging role of RyR from the concluding sentence. The modified sentence reads as follows:

      “To conclude, we report functional analysis of dRyR, the sole fruit fly RyR gene and show that in addition to ensuring contractile properties of differentiated striated muscle it plays a key pro-myogenic role during muscle development.”

      Finally, the use of CRISPR to test a VUS is excellent and suggests a good way for testing of additional RyR variants in the future.

      Minor comments:

      (1) Figure 1A: In the Introduction it is stated that non-mammalian vertebrates have two RyR genes, alpha and beta. In Fig. 1A, a single chicken and single frog gene are listed under names different than alpha or beta. The figure also focuses on RyR2 genes, yet the Introduction states that the non-mammalian vertebrate genes are homologous to RyR1 and RyR3 in mammals. The dichotomy between the text and the figure is confusing. Finally, the font used in Fig. 1A should be enlarged for better visibility.

      To avoid the dichotomy, we modified our sentence concerning the non-mammalian vertebrate RYR genes in the Introduction section. As indicated, there are two RYR genes in chicken and frog, one of which shares homology with vertebrate RYR2 and is represented in the phylogenetic tree (Fig. 1A). As requested by the reviewer, we enlarged the font in the revised Fig. 1A to ensure better visibility.

      (2) Figure 3G-I: IF to Kettin is used to reveal sarcomeres but is not mentioned in the text. This protein is not present in vertebrates (I believe) and may not be familiar to many readers. It should be described in the text when it is used.

      We are grateful to the Reviewer for reminding us to provide information about Kettin, which represents the Drosophila counterpart of Titin. The following information has been added to the text on page 9: “…which in turn correlated with shortening of Kettin/D-Titin-labelled sarcomeres…”

      (3) Figure S2: The panels are labelled E, F, G. They should be A-D, as is used in the text.

      In the revised version of Fig. S2, the panel labels were amended and the view in panel E was enlarged. We also provide an additional control context (C57>LacZ).

      (4) The dRyR16 allele is used in Figure 5 and S4. It is described as a hypomorph in the text on page 12 but as a null in the legend to Figure 5. Do the authors actually mean "homozygous" in the legend? The difference should be clarified.

      The dRyR<sup>16</sup> allele has previously been described as a hypomorph. Indeed, in the legend of Fig. 5 we mistakenly described it as a “null”. As suggested by the Reviewer, we have changed this to “homozygous”.

      (5) The Met codon that is mutated in the variant studied in Figure S5 and Figure 6 is position 488 in humans. It is referred to that way in the fly version also. Is that true, the actual amino acid number is identical in humans and flies? In Figure S5B, it might be worth showing the primary amino acid sequence surrounding Met488 to reveal the degree of local conservation (beyond the orange domain in that panel).

      To provide more information about the conservation, we have added to the revised Fig. S5 an alignment of the amino acid sequence surrounding the human RYR1 variant position 4881, which corresponds to position 4971 in Drosophila dRyR.

      Author response image 1 shows a snapshot of a larger portion of the alignment encompassing the variant position, showing high amino acid conservation around it:

      Author response image 1.
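
As an illustration of how a residue number in one sequence can be related to its counterpart in another through a pairwise alignment, the following is a minimal Python sketch. It is not the authors' procedure: the function name `map_position` and the toy sequences are hypothetical, and the real correspondence reported above (human 4881 to fly 4971) comes from the authors' own alignment of the full-length proteins.

```python
def map_position(aligned_query, aligned_target, query_pos):
    """Map a 1-based residue position in the query onto the target.

    Both arguments are rows of the same pairwise alignment, with '-'
    marking gaps. Returns the 1-based target position, or None if the
    query residue is aligned to a gap or the position is out of range.
    """
    if len(aligned_query) != len(aligned_target):
        raise ValueError("aligned sequences must have the same length")
    q_count = 0  # residues seen so far in the query
    t_count = 0  # residues seen so far in the target
    for q_char, t_char in zip(aligned_query, aligned_target):
        if q_char != "-":
            q_count += 1
        if t_char != "-":
            t_count += 1
        if q_char != "-" and q_count == query_pos:
            return t_count if t_char != "-" else None
    return None


# Hypothetical toy alignment: the target carries a two-residue insertion,
# so every query residue after it is shifted by two positions.
toy_query = "MK--LVMA"
toy_target = "MKQQLVMA"
mapped = map_position(toy_query, toy_target, 5)  # query residue 5 is 'M'
```

In this toy case the two-residue insertion upstream shifts the numbering by two; the difference between the human and fly numbering arises in the same way, from the cumulative length differences between the two proteins upstream of the variant.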

      (6) At least two references cited in the text are not listed in the References section (Hadiatullah et al. and Nishimura et al.).

      We double-checked the reference citations, and the two indicated references are now listed in the References section.

      Reviewer #1 (Significance):

      The paper is significant in that RyR is known to be a critical protein in calcium-dependent excitation-contraction coupling but its role in developmental myogenesis is poorly studied. This study demonstrates that it is expressed during, and is important for, embryonic and larval myogenesis in the fly. RyR is also understudied in this valuable model organism, even though a P element-based mutant has been available since 2000. The mechanistic basis for the functional observations is not explored here but the work is well performed and will be of interest to investigators studying muscle development (my own field) and diseases caused by RyR mutations.

      To reinforce the mechanistic/functional side of our study, we have added to the revised Fig. 5 a new panel G showing the promyogenic role of another major cellular calcium regulator, the ER calcium pump SERCA. Lms-targeted RNAi knockdown of SERCA impairs myotube growth, resulting in a thin muscle fiber phenotype. This indicates that both dRyR-regulated cytosolic and SERCA-regulated ER store calcium levels are required to promote muscle development.

      Reviewer #2:

      Summary:

      This paper presents data using the Drosophila model to analyze the effects of a rare human mutation in the gene encoding the ryanodine receptor (ryr). The authors present a nice, comprehensive phylogenetic analysis that shows the Drosophila version of Ryr to be most similar to human RYR2 and that the known "hot spots" for mutations in RYR2 coincide with highly conserved regions of the Drosophila Ryr. They characterize the functional effects of ryr knockdown and overexpression on both adult heart function and larval body wall muscle. They identified embryonic ryr expression in association with actin-stained muscle precursor cells and provide beautiful stains, which clearly showed that embryonic muscle cell development was disrupted in ryr mutants. In support of these findings, KD of Calmodulin in larva (an Ryr inhibitor) phenocopied Ryr OE. They recreated a human variant of unknown function (RyR1 p.Met4881Ile ) in the conserved region of the fly gene and tested the effect on larval muscle. Their data suggested that this variant was likely deleterious as it negatively affected most muscle parameters. This work supports a role for the fly model in testing potential human disease gene variants.

      Major comments:

      (1) Fig. 1: In G there is no data for the RNAi KD situation.

      We are grateful to the Reviewer for pointing this out. We initially did not include these data because of the large difference in the crawling capacity of dRyR-RNAi larvae. In the revised version of Fig. 1 we now provide dRyR-RNAi larval crawling data. Because of their inefficient crawling, the time scale in panel 1G was modified.

      (2) Fig. 2 Authors should include Diastolic Diameters; they mention dilated cardiomyopathy but don't show the dilation. The authors should also show staining in hearts with RYR OE and RNAi. It would be nice to have some kind of quantification of disorganized myofibrils.

      As requested, in the revised Fig. 2 we provide diastolic diameter measurements. We also include a systolic interval graph to give a full picture of the cardiac parameters. We do not observe all signs of dilated cardiomyopathy in the dRyR-RNAi context, as the systolic diameter increases but there is no significant change in diastolic diameter.

      We have modified our comments in the text accordingly (page 7).

      “…As the diastolic diameter remained unchanged, we conclude that cardiac dRyR knockdown affects cardiac performance without causing dilated cardiomyopathy…”

      Regarding the circular myofibril pattern, we do not observe irregular myofibril orientation but rather a fuzzy and less distinct sarcomeric pattern that is difficult to quantify. We specify this in the Figure 2 legend (page 8).

      “…circular fibers in Hand>dRyR RNAi (E) context showed a fuzzy pattern suggesting an affected sarcomeric organisation…”

      Author response image 2 shows the entire cardiac tube in the dRyR-RNAi context (stained with phalloidin); despite the less distinct circular myofibrils, no obvious differences from wt are observed.

      Author response image 2.

      (3) To evaluate and reproduce the data on the larva muscle parameters the authors should provide more details on how sarcomere length was quantified in each larva (replicates, ROI size, etc). Similarly, how were # of nuclei quantified / normalized? Importantly for these measurements, did the authors know what the contraction state of the muscles were when fixed?

      We have added the requested information to the Materials and Methods section:

      “Muscle characteristics measurements:

      All analyses of muscle length and sarcomere size were performed on fixed larval muscle preparations in a relaxed state. Acquired confocal images were analysed in Fiji using the line tool. The Analyze – Measure tool was then applied to obtain muscle length values, and measurements were analysed with Prism. Sarcomere size and number were calculated using the Analyze – Plot Profile Fiji tool. Sarcomere size was measured between peaks corresponding to Z-discs (revealed with a Z-line-specific marker) over approximately 100 µm of muscle length. Sarcomere measurements were then analysed with Prism.

      DAPI-stained nuclei were counted in Z-stacks of confocal views of VL3 larval muscle, and the data were analysed with Prism. About 30 larval muscles from 6-8 larval filets were analysed for each measurement.”

      Statistics

      All statistical analyses were performed using Prism (v9.5.1, GraphPad Software, La Jolla, CA, USA). The t test was used to compare the control to the variant context, and one-way ANOVA was used for comparisons involving more than two datasets. Bar plots represent the mean and the standard deviation. In the figures, statistical comparisons of sample vs. control are indicated as ****: P ≤ 0.0001; ***: P ≤ 0.001; **: P ≤ 0.01; *: P ≤ 0.05; ns: P > 0.05.
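
The sarcomere measurement described in the Methods text above (distances between Z-disc peaks along an intensity profile) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' workflow, which used Fiji and Prism: the function names, the intensity threshold, the pixel size, and the toy profile are all hypothetical.

```python
from statistics import mean


def find_peaks(profile, min_height):
    """Return indices of local maxima at or above min_height."""
    peaks = []
    for i in range(1, len(profile) - 1):
        is_local_max = profile[i] > profile[i - 1] and profile[i] >= profile[i + 1]
        if profile[i] >= min_height and is_local_max:
            peaks.append(i)
    return peaks


def sarcomere_lengths(profile, um_per_pixel, min_height):
    """Distances in µm between consecutive Z-disc peaks of a line profile."""
    peaks = find_peaks(profile, min_height)
    return [(b - a) * um_per_pixel for a, b in zip(peaks, peaks[1:])]


# Hypothetical intensity profile with a Z-disc peak every 5 pixels.
profile = [10, 12, 90, 14, 11, 10, 13, 95, 12, 10, 11, 12, 88, 13, 10]

# Assumed calibration of 0.5 µm per pixel and an arbitrary threshold of 50.
lengths = sarcomere_lengths(profile, um_per_pixel=0.5, min_height=50)
mean_length = mean(lengths)
```

Real profiles are noisier than this toy example, so the threshold (or a smoothing step) would need to be chosen per image set. The number of sarcomeres along the measured segment is simply the number of inter-peak intervals.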

      (4) Fig. 3, Are RNAi and OE in the same background? I only see one control in the graphs for the RNAi line background.

      We agree. To avoid potential bias between the RNAi and OE genetic contexts, we now provide an additional OE control (C57>lacZ) in the revised version of Fig. 3.

      Thus, two controls, one for the RNAi and one for the OE context, are now included.

      (5) Fig. 3 How VL3 length was determined needs more detail, the Zhang ref is not adequate.

      We thank the Reviewer for this comment and now provide more details about the method used to measure VL3 length (new paragraph in Materials and Methods); see also our answer to point 3. The Zhang et al. reference relates to the quantification of the mitochondrial pattern.

      (6) In order to be able to evaluate the data, the statistical tests used should be cited in the figure legends along with what *, ** ,*** stand for (or just provide p values).

      We have now added the information about the statistical tests to the figure legends, in addition to the dedicated paragraph in the Materials and Methods section (see our answer to point 3).
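
For readers who want to reproduce the asterisk notation, the thresholds stated in the Statistics paragraph of the response to point 3 can be written as a small lookup. The sketch below is an illustration only; the function name `significance_label` is hypothetical, and the P values themselves were computed by the authors in Prism.

```python
def significance_label(p_value):
    """Convert a P value into the asterisk notation used in the figures.

    Thresholds follow the Statistics paragraph:
    **** P <= 0.0001, *** P <= 0.001, ** P <= 0.01, * P <= 0.05, else ns.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a P value must lie between 0 and 1")
    if p_value <= 0.0001:
        return "****"
    if p_value <= 0.001:
        return "***"
    if p_value <= 0.01:
        return "**"
    if p_value <= 0.05:
        return "*"
    return "ns"


# Hypothetical P values, chosen only to exercise each threshold.
labels = [significance_label(p) for p in (0.00005, 0.0005, 0.005, 0.03, 0.2)]
```

Because each bound is inclusive, a P value exactly at a threshold (for example 0.05) receives the more significant label, which matches the "≤" signs in the stated convention.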

      Minor comments:

      (1) Need more detail in the figures, e.g. add what colors go with which stain to the picture.

      We provide this information in the revised version of the figure legends.

      (2) Page 13, (Fig. ?F, G).

      We apologize for this mistake and have added the missing figure number (Fig. 5).

      (3) Fig. 4 "partially co-localizing with actin".... this is confusing and probably an overstatement based on the staining pattern in a whole embryo and not on an optical section or a higher power image with a more restricted field of view.

      We agree and have removed this statement from the Fig. 4 legend.

      (4) Some of the graphs are a bit small, recommend reducing the statistical comparison brackets to straight lines, which eliminates a lot of white space and would allow the graphs to be enlarged.

      We increased the size of the graphs in the revised Fig. S2 and Fig. 5.

      Reviewer #2 (Significance):

      The authors nicely characterized the role of Ryr in muscle development and function and recreated a human variant of unknown function (RyR1 p.Met4881Ile) in the conserved region of the fly gene. Their data suggested that this variant was likely deleterious as it negatively affected most muscle parameters. This work supports a role for the fly model in testing potential human disease gene variants. The reviewer's field of expertise is in Drosophila genetics and in the use of the fly as a model system for understanding the genetic networks contributing to muscle structure and function at the cellular level.

      Reviewer #3:

      Summary

      This paper examines the Drosophila Ryanodine Receptor (RyR or dRyR). Ryanodine receptors are enormous channel proteins that mediate calcium efflux from the endoplasmic reticulum and sarcoplasmic reticulum. One goal of the work is to describe salient developmental features of Drosophila RyR (i.e., where it localizes in the cell and how it contributes to muscle development and function) and to refine knowledge from prior reports. Many of the analyses toward that goal are well done; this reviewer especially likes the examination of how muscles develop (Fig. 5).

      Another goal is to compare this information with what is known about mammalian RyRs. There seems to be a lot in common between Drosophila and mammalian RyRs. The paper finishes by taking a human ryanodine receptor variant of unknown significance and generating the corresponding amino-acid substitution in Drosophila RyR. The substitution has some phenotypic consequences for fly coordination, so the authors conclude that the human variant is likely to be pathogenic.

      In terms of investigation, a refined description of RyR biology is welcome. Ryanodine receptors are critical contributors/mediators of intracellular calcium signaling processes. Understanding their properties can help to contextualize the results of studies where calcium dynamics are at play. This is true for both Drosophila and non-Drosophila work. For this version of the paper, there are several statements that should be edited, both in terms of accuracy and in terms of reporting prior knowledge. Additionally, some experiments are missing controls or reagent verification. Importantly, the anti-RyR antibody needs supporting information regarding its specificity.

      Main Comments

      (1) The paper does not fully state what has been done before in terms of studying Drosophila ryanodine receptor expression. In comparing the work on ryanodine receptors in vertebrates versus Drosophila, the authors write, "By contrast, no systematic analyses have yet been performed to assess the expression of the sole Drosophila dRyR gene." I was a little surprised by this sentence, so I examined the literature. There are hundreds of Drosophila publications that mention the ryanodine receptor in some way, but they are not about gene expression. As stated, the sentence might depend on what the authors mean by "systematic analyses." Two early works are relevant here: the Hasan and Rosbash, 1992 paper and the Sullivan et al., 2000 paper. Both are cited in this study. And both of these early papers addressed RyR gene expression, so that fact should be acknowledged up front.

      We agree with the Reviewer that there is a large number of publications that mention the Drosophila ryanodine receptor, two of which, identified by the Reviewer, provide information about Drosophila RyR expression. We refer to both of them and follow the Reviewer’s suggestion to further acknowledge their work. The modified sentence in the text reads as follows:

      “…in spite of early works by Hasan and Rosbash (1992) and Sullivan et al., (2000) no systematic analyses have yet been performed to assess the developmental expression pattern of the sole Drosophila dRyR gene…”

      Concerning “systematic analyses” we mean the analyses of dRyR expression at both transcripts and protein levels during embryonic development and in differentiated muscles.

      (2) (Related) I examined those two early papers to cross-check the extent of analysis done previously. The text of Hasan and Rosbash reports in situ examination of RyR transcript using a digoxigenin probe (though the online version of that 1992 paper seems to have left out the relevant mesodermal and muscle images referenced in the paper, in favor of duplicating Figure 5 three times - I emailed Development to alert them). More relevant, several experiments executed in the Sullivan paper agree closely with the current paper. As such, it needs more complete referencing. The Sullivan paper showed short, round larvae in mutants (Fig. 1 of Sullivan); ubiquitous mRNA, strongly in muscle and mesoderm (Fig. 2 of Sullivan); impaired muscle function in mutants (Fig. 3 of Sullivan), and impaired larval heart rate (Fig. 4 of Sullivan).

      The Sullivan et al. paper is indeed a reference paper for Drosophila RyR. Our data are, however, largely novel and/or substantially extend those reported by Sullivan. Notably, we show for the first time the developmental dRyR protein expression pattern in embryos and in larval filets; we also analyse dRyR isoform transcript expression and provide for the first time embryonic muscle phenotype analyses that shed light on the so far under-investigated developmental function of dRyR.

      We follow Reviewer’s suggestion and provide in the revised version additional citations of this work:

      “…attenuation of dRyR (C57>dRyR RNAi) led to a significantly reduced larva body length (Fig. 3B, M) compared to control (Fig. 3A, Q), an observation that correlates with previously observed (Sullivan et al., 2000) reduced body size of dRyR<sup>16</sup> mutant larvae…”.

      “…our data extend previous observations of affected muscle contractility in RyR mutants (Sullivan et al., 2000)…”

      “…Overall, observed dRyR loss-of-function heart phenotypes with a slow heart rate and increased arrhythmia correlate with impaired cardiac function in RyR mutant larvae (Sullivan et al., 2000)…”

      (3) Fig. 1B-D (antibody staining): There are puzzles with this experiment. The first is with the anti-Dlg channel. Dlg is a core component of the NMJ postsynaptic density, and the antibody reveals a bright cage of Dlg around the boutons. But with the muscle images in Figure 1B, there are no boutons apparent (unless they are so far out of focus as to be invisible).

      Indeed, Dlg also stains postsynaptic NMJs at the muscle surface. In Fig. 1B, which shows more internal optical sections to reveal T tubules, the Dlg-positive NMJs are out of focus.

      The second question centers on the dRyR antibody. The results state, "We first tested the expression of dRYR at the protein level." This sentence appears immediately after the sentence for gene expression from point 1. Technically, this antibody will help determine protein localization, not gene expression. But more importantly, there is no supporting/verifying information about this guinea pig anti-dRYR antibody. The methods state that it was provided by Robert Scott from NIMH. But there is no accompanying citation, no information about the antigen used to raise the antibody, and no negative control (either mutant or RNAi) to show that the staining is specific. If this is a published anti dRyR antibody that already meets the standards of specificity, that should be made clear, and the citation should be given. But if not, the information and data about the production of the antibody and the testing of its quality needs to be shared.

      We apologize for this omitted citation. The anti-dRyR antibody has been previously described and its specificity tested in the article by Gao et al. (2013). The corresponding author of this paper, David J. Sandstrom, has left NIMH, and the anti-dRyR antibodies are currently curated by Rob Scott from Benjamin White’s lab at NIMH.

      He generously sent us a sample of this antibody. We add this information to the Materials and Methods section.

      (4) Fig. S1: Similar to the antibody, is there a negative control probe that does not reveal this expression pattern? There are any number of probes or secondary antibodies that non-specifically label Drosophila muscles in patterns just like this.

      We are confident that the HCR probes are working properly as they reveal dRyR transcripts expression that is consistent with dRyR protein expression pattern. In parallel they show differential expression in embryos.

      Author response image 3 shows the control HCR ISH experiment with a probe that detects Apterous transcripts (specific for a subset of embryonic muscles and not present in L3 larval muscles).

      Author response image 3.

      A comparison between Ap HCR (A, A’) and dRyR Ex23 HCR (E, E’) signals.

      Minor Comments

      (1) "Overall, observed dRYR loss-of-function heart phenotypes...are reminiscent of those associated with aging (Nishimura et al., 2010), indicating that dRyR RNAi-induced impairment of Ca2+ homeostasis contributes to cardiac aging..." The conclusion of the sentence does not logically follow from the first part. This is because the tests conducted here were on rhythm, not on calcium homeostasis and cardiac aging.

      So, the tests cannot definitively say anything about those latter phenotypes.

      To answer this reviewer’s comment, we modify the concluding sentence as follows:

      “…We hypothesize that dRyR RNAi-induced impairment of Ca2+ homeostasis could contribute to cardiac aging, for which Drosophila is a recognized model (Nishimura et al., 2011).”

      (2) Fig. S2 (bar graph): "% of total" - Is this supposed to refer to the percentage of the total muscle area that is positive for ATP5a staining? That should be clarified.

      We provide clarification in the Fig. S2 legend: “% of total” means the percentage of the measured muscle area that is positive for ATP5a staining.

      (3) Fig. 3M, should say length

      Done

      (4) Fig. 5A legend - See Sullivan; that paper concluded that RyR[16] was hypomorphic instead of null, based on RyR[16]/Df comparison to RyR[16]/RyR[16]. Intuitively, I agree; a lesion that rips out the start site would likely be null. The antibody could help with classifying the allele, depending on the part of RyR used as the antigen.

      The RyR<sup>16</sup> mutants were indeed described by Sullivan et al., as hypomorphic and not null. In the Fig. 5 legend we modify the comment to: “…homozygous dRyR<sup>16</sup> mutant embryo…”

      (5) Discussion: "This also suggests that all dRyR isoforms are collectively required for larval muscle function." That sentence does not logically follow the expression information. In order to test that idea, individual isoforms would need to be eliminated or knocked down.

      We agree with this comment and modify our sentence accordingly.

      “However, whether all dRyR isoforms are collectively required for larval muscle function requires further investigation.”

      Reviewer #3 (Significance):

      The idea that RyR is expressed in many kinds of muscle is put forth as a major conclusion. It is good that the authors report this fact, and the impacts on muscle development documented in Figure 5 are some of the best data in the paper. However, in terms of opening up a new understanding of RyR biology, the impact of this information seems modest. Prior Drosophila work and the work of others studying these channels show that ryanodine receptors are ubiquitous. The fact that there is only one Drosophila RyR gene would lead most scientists to hypothesize that it would be present on the ER surfaces of all kinds of tissues, including different types of muscle. Novel phenotypic information for Drosophila RyR is reported in the study, and this is good. But in terms of the model system, the strength of Drosophila is in using genetic combinations to make refined conclusions. That toolkit is not fully used here; therefore, the paper is mostly descriptive. This study is mostly a single-gene study (dRyR), with isolated exceptions, like Cam knockdown in Figure 5.

      To improve the functional/mechanistic aspect of the manuscript, in the revised version we add to Fig. 5 an analysis of the myogenic role of an additional calcium regulator, the ER calcium pump SERCA.

    1. eLife Assessment

      This important study uses a tripartite transdiagnostic computational framework to distinguish depression-specific, anxiety-specific, and shared psychopathology dimensions, in their relationships to mood variability and mood reactivity to reward prediction errors across multiple large non-clinical cohorts and a clinical sample. The evidence is convincing overall because the study combines large samples, a well-characterized gambling task and in-depth computational and psychometric analyses, and it replicates the depression-specific association with blunted reward prediction error-sensitivity in a clinical sample. However, the anxiety-specific effects are less consistently supported across individual datasets, may be underpowered in the clinical cohort because of comorbidity, and some aspects of the factor-analytic, risk-attitude, and mediation analyses would benefit from clearer explanation. These findings advance a mechanistic account of how distinct symptom dimensions differentially shape reward-based mood updating and variability, providing a principled framework for future transdiagnostic modeling.

    2. Reviewer #1 (Public review):

      This is a very interesting paper. The research question is intriguing, allowing the authors to address commonly observed comorbidities between depression and anxiety and their dissociable and opposite relationship to mood fluctuations and sensitivity to reward prediction errors. The computational analyses are very in-depth, including many state-of-the-art checks and validations. Another strength is the inclusion of several large or very large samples, including a patient sample in addition to the general population sample.

      I have the following questions:

      (1) Factor analysis: I found the hierarchical organization of the factors interesting. While this is a very common procedure in, for example, the field of intelligence (producing sub-scores and a general g factor), it is not yet very commonly used in the field of computational psychiatry (though it has been validated before for anxiety/depression, so it is used here with good reason). I was also impressed by the methodological depth. In particular, it was of note how thoroughly done it was (for example, repeating the EFA on the second half of the data set). I have one question though: is the sample size too small for the exploratory analyses, given the number of items? Given the stability across the half-split, I imagine it is not. Perhaps the authors could spell out how many items, what would be the recommended standard for a subject-to-item ratio, and comment on this. A very technical point, the authors should specify how they extracted the factor scores from the other data sets (is it using the Thurstone or Bartlett method)? From experience (though not doing a hierarchical factor analysis), Bartlett can be somewhat better compared to the default (Thurstone) - better as in the resulting factors more closely recapitulating the factor correlations in the original sample (and independence of responses of other participants in a sample for computing a person's factor score). Could you also comment on similarities or divergences in this hierarchical factor analysis approach from another one recently used transdiagnostically in Wise et al. (2026, Translational Psychiatry)?

      (2) Linking factors to task parameters: As I understand it, the authors relate the orthogonalized depression/anxiety to task parameters (sensitivity to RPEs on mood and mood variations) using correlations. In order to have a better understanding of how this relates to other commonly used approaches, I would pose two questions:

      (i) What are the correlations when the full (non-orthogonalized) factor scores for depression and anxiety are used? Are the signs the same? (ii) What are the results when, instead of the independent correlations, the authors perform b_RPE ~ anxiety + depression (again using the non-orthogonalized factors)?

      I'm assuming all of these analyses should give the same results if the authors' hypothesis of opposing effects of anxiety and depression holds true.
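
      The two checks requested in (i) and (ii) can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' analysis: the variable names (`anxiety`, `depression`, `b_rpe`), the simulated effect sizes, and the strength of the factor correlation are all assumptions made for the example. It contrasts the separate Pearson correlations with the joint ordinary least squares fit of b_RPE ~ anxiety + depression on correlated (non-orthogonalized) scores.

```python
import numpy as np


def compare_correlation_and_regression(anxiety, depression, b_rpe):
    """Pairwise Pearson correlations and joint OLS slopes for b_RPE ~ anxiety + depression."""
    r_anxiety = np.corrcoef(anxiety, b_rpe)[0, 1]
    r_depression = np.corrcoef(depression, b_rpe)[0, 1]
    # Design matrix with an intercept column followed by the two predictors.
    design = np.column_stack([np.ones_like(anxiety), anxiety, depression])
    coef, *_ = np.linalg.lstsq(design, b_rpe, rcond=None)
    return {
        "r_anxiety": r_anxiety,
        "r_depression": r_depression,
        "beta_anxiety": coef[1],
        "beta_depression": coef[2],
    }


# Synthetic, hypothetical data: two correlated symptom scores with opposing
# simulated effects (+0.3 and -0.3) on the RPE-sensitivity parameter.
rng = np.random.default_rng(0)
n = 5000
shared = rng.standard_normal(n)
anxiety = 0.7 * shared + 0.7 * rng.standard_normal(n)
depression = 0.7 * shared + 0.7 * rng.standard_normal(n)
b_rpe = 0.3 * anxiety - 0.3 * depression + rng.standard_normal(n)

result = compare_correlation_and_regression(anxiety, depression, b_rpe)
for name, value in result.items():
    print(f"{name}: {value:+.3f}")
```

      In this simulated case the signs agree across both analyses, but the separate correlations are attenuated relative to the joint slopes because the shared variance between the two scores partly cancels the opposing effects. Whether the same holds in the real data is exactly what the questions above ask.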

      Minor comments:

      (1) The authors should write down when the data were collected for each study. This is because AI capabilities have massively increased since ~2020 in quite specific steps (with the public release of new AI models), meaning that AI is likely to have been able to do tasks and questionnaires without detection if data were collected recently.

      (2) The authors should include a statement in the methods section that checks for AI were done. If none yet, could you do any? Recent papers (Westwood, PNAS 2025; van der Stigchel, PNAS 2026) point to the risk since at least the release of o4-mini (used in the cited paper to create very human-like behaviour).

      (3) It would have been good to collect questionnaires of other, thought to be unrelated psychiatric traits, like compulsivity or schizophrenia symptoms, to check the specificity of the results, also under the assumption that higher scores on either of these skewed questionnaires can pick up individual differences in 'bad questionnaire completion'. The authors should comment on the absence of other questionnaires in the discussion in the limitations section.

      (4) The authors could include a more explicit sentence in the abstract stating that the anxiety result did not hold up in the clinical population.

    3. Reviewer #2 (Public review):

      Summary:

      Despite their common co-occurrence, depression and anxiety are known to alter mood fluctuations in opposite ways. Here, the authors aimed at distinguishing depression-specific from anxiety-specific from psychopathology-general effects of reward processing on mood fluctuations, focusing on reward prediction errors (RPEs), which are known to be linked to mood fluctuations. This mechanistic study aims at uncovering the process through which these psychopathologies are associated with mood modulations. The authors were able to appropriately test their hypothesis and obtained results corroborating their conclusions.

      This work provides a convincing demonstration of the relevance of computational psychiatry (Huys et al, 2016) and the use of decision neuroscience to shed light on the interplay of anxiety, depression, and mood.

      Strengths:

      The authors used a tripartite model to distinguish depression vs anxiety, as well as a computational model distinguishing reward expectation (EV in the model) from outcome processing through RPE, which are two sequential cognitive processes.

      The manuscript adequately addresses the concerns one would have regarding risk-attitudes and regarding referring to trending statistical results.

      Weaknesses:

      The size of the clinical sample (N=116) may not be sufficient to detect anxiety-specific effects due to the high rate of comorbid anxious depression. It would be beneficial to include the number of MDD vs GAD vs anxious depression diagnoses in the clinical population, as this would likely shed light on the power limitations.

    4. Reviewer #3 (Public review):

      Summary:

      In this submission, Wang and colleagues jointly examine the association between depression and anxiety symptoms and individuals' affective reactivity to reward prediction errors in Rutledge et al.'s gambling paradigm. Taking a bifactor approach to anxiety and depression in several non-clinical samples (and one clinical sample), the authors find that anxiety-specific symptoms relate to over-reactivity of mood to reward prediction errors (RPEs) as well as heightened mood variability, while depression-specific symptoms relate to blunted mood sensitivity to RPEs. These depression- but not anxiety-specific relationships replicated in patient samples.

      Strengths:

      I was impressed that the data-driven, transdiagnostic approach employed by the authors uncovered specific relationships between anxiety and depression-specific factors and RPE reactivity in a well characterized task and computational model, especially in a non-clinical sample. This sheds new light on how these affective processes may be perturbed, and importantly in different ways, by anxiety and depression symptoms. Likewise, the replication of the depression-specific finding (RPE hypo-reactivity) in a clinical sample was nice to see.

      Weaknesses:

      (1) While the anxiety- and depression-specific factors had differential effects on mood variability (Figure 2A-D) and RPE reactivity (Figure 2E-G) in all samples, such that the correlations between the two factors and these mood parameters were significantly different, the anxiety factor was not consistently (significantly) associated with either mood-related parameter across samples. However, the authors resolve anxiety-specific predictive effects when they collapse across datasets. While it is intuitive that achieving a larger effective sample size would afford the power necessary to detect such individual differences, this struck me as a major caveat for this set of results.

      (2) The authors observe associations between the 'common factor' of depression and anxiety and risk-attitude tendencies, presumably the alpha (exponent) parameter in a prospect theory-type subjective value model. But where is this analysis explained? (i.e. how was this model formulated and how were risk attitude parameters estimated?) And what is the interpretation of this finding - is there precedent for looking at risk attitudes in this task? And why would these predictive effects only be observed in relation to the common, but not unique, factors of anxiety and depression?

    1. eLife Assessment

      This valuable study addressed a key question in epilepsy research: whether the recordings of very fast oscillations in the brain (>250Hz, fast ripples) reflect underlying pathology or might be a property that emerges from a neuronal network at random. The strengths of the study are the importance of the question, the multiple methods, and the solid evidence. However, there are limitations to the methods that should be addressed.

    2. Reviewer #1 (Public review):

      Summary:

      This is a study utilizing several types of analyses (computational modeling, neuronal cultures, rodent epilepsy model, and human intracranial multi-scale recordings) to address a highly relevant conceptual question: Are fast ripples (FRs) distinct pathological entities or largely emergent products of stochastic spike clustering? The results can potentially reshape current approaches to incorporating fast ripples into the epilepsy surgery evaluation.

      Strengths:

      The conceptualization of fast ripples as potentially arising by chance is highly novel and builds effectively on questions raised in prior studies that have never been satisfactorily resolved.

      The integration across biological scales and models is a major strength. The state dependency analysis provides additional, strong support. The methodology and statistical approaches used are thoughtfully presented and rigorously applied.

      In particular, this paper provides a strong response to the findings from Gliske et al, Nat Commun 2018. This study utilized long-term data analysis to uncover low rates of FRs detected from most recording sites, suggesting spurious detections, although FRs were concentrated within seizure onset areas.

      Weaknesses:

      The authors clearly aimed to use a statistical rather than a mechanism-based approach in this work. However, the paper's framing of true fast ripples as oscillatory events with stochastic fast ripples considered as confounders does not take into account prior investigations into biological mechanisms, particularly prior studies that point to an important role for stochastic fast ripples in some contexts. Incorporating recognition of these mechanisms would strengthen the manuscript and provide a more complete and nuanced characterization.

      Some examples from the literature:

      Eissa et al, eNeuro 2016, a paper that closely parallels this manuscript but took a mechanistic rather than statistical approach, showed that fast ripples can arise from population paroxysmal depolarizations - a key feature of epileptiform discharges - as temporally clustered, jittered population firing, with FRs appearing in LFP or EEG due to summated postsynaptic potentials (which are slower than action potentials and can generate signals in the high gamma range).

      Foffani et al., 2007, Neuron, and Ibarz et al., 2010, J Neurosci, argue that FRs are pseudo-oscillations created by jittered neuronal populations in the setting of altered spike timing.

      Smith et al., 2020, Sci Rep, contrasts FR characteristics in different regimes, i.e., intact inhibition early in a seizure vs. implied collapse of inhibition after recruitment. Schlingloff et al., 2025, J Neurosci, reported analogous findings in an animal model.

      The computational model and subtraction approach provide a strong case for the random emergence of clustered activity in the high gamma band, given its assumptions. However, any such modeling effort needs to account for inhibitory activity, including impaired inhibitory function that is expected in epileptic brain regions, which has a strong modulating effect on excitatory firing and is thought to play a significant role in FR generation.

      The shuffling procedure aims to preserve the power spectrum but randomizes high frequency phase (>200 Hz). However, this procedure removes biologically meaningful spike timing correlations, as well as structured cross-frequency coupling. The subtraction method thus likely underestimates the incidence of structured "distinct" FRs, while perhaps overestimating "chance" FRs due to biologically infeasible activity, making the statement that most FRs are due to chance correlation too strong.
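
      To make the concern concrete, a surrogate of the kind described here can be sketched as below. This is a generic phase-randomization surrogate written under stated assumptions, not the authors' implementation: the function name `randomize_high_freq_phase`, the sampling rate, and the test signal are hypothetical, and only the 200 Hz cutoff comes from the description above. Fourier amplitudes are kept, so the power spectrum is unchanged, while phases above the cutoff are redrawn uniformly at random, which is what removes any spike-timing structure or cross-frequency coupling carried by those phases.

```python
import numpy as np


def randomize_high_freq_phase(signal, fs, cutoff_hz=200.0, rng=None):
    """Return a surrogate with the same amplitude spectrum but random phases above cutoff_hz."""
    rng = np.random.default_rng() if rng is None else rng
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    high = freqs > cutoff_hz
    if len(signal) % 2 == 0:
        # The Nyquist bin of an even-length real signal must stay real, so leave it alone.
        high[-1] = False
    random_phase = rng.uniform(0.0, 2.0 * np.pi, size=int(high.sum()))
    spectrum[high] = np.abs(spectrum[high]) * np.exp(1j * random_phase)
    return np.fft.irfft(spectrum, n=len(signal)), high


# Hypothetical test signal: a 10 Hz oscillation plus broadband noise, 1 s at 2 kHz.
rng = np.random.default_rng(0)
fs = 2000.0
t = np.arange(2000) / fs
signal = np.sin(2.0 * np.pi * 10.0 * t) + 0.2 * rng.standard_normal(t.size)

surrogate, high_mask = randomize_high_freq_phase(signal, fs, cutoff_hz=200.0, rng=rng)

original_spectrum = np.fft.rfft(signal)
surrogate_spectrum = np.fft.rfft(surrogate)
same_power = np.allclose(np.abs(original_spectrum), np.abs(surrogate_spectrum))
same_low_band = np.allclose(original_spectrum[~high_mask], surrogate_spectrum[~high_mask])
waveform_changed = not np.allclose(signal, surrogate)
print(same_power, same_low_band, waveform_changed)
```

      The checks at the end confirm that the amplitude spectrum and the content below the cutoff are untouched while the waveform itself changes. Any event that depended on a specific phase alignment above 200 Hz would not survive this procedure.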

      The kainate findings underscore this point: the increase in the number of FR detections could be, as the authors state, an increase in chance clustering due to increased network excitability generally. However, the likelihood of a parallel increase in pathological FRs cannot be ruled out, given likely pro-epileptic alterations in spike timing and circuit function.

    3. Reviewer #2 (Public review):

      Summary:

      This paper asks an important question that has not been discussed much in the extensive literature on the High Frequency Oscillations (HFOs) that have been extensively studied in patients with epilepsy and experimental models of epilepsy. The question is whether the Fast Ripples (FRs), the HFOs in the 250-500 Hz frequency band, represent a pathological phenomenon or represent a physiological phenomenon that occurs in the healthy brain but happens to be more frequent in epileptic tissue. It is an important question that has not been systematically addressed until now. The authors conclude, from very extensive simulations, from extensive experimental animal studies (the systemic kainate model of epilepsy in rats), and from a modest amount of human data, that FRs occur in healthy brains as a result of the chance occurrence of bursts of action potentials, and that in epileptic tissue, their frequency of occurrence is approximately 30% higher than what is expected by chance. They conclude that FRs are not a separate phenomenon of epileptic tissue. This finding is reinforced by the recent findings of FRs in experimental models of Alzheimer's disease.

      Strengths:

      This is a valuable study because it asks an important and original question and because it evaluates it from several angles (simulation, tissue culture, experimental animals, and human patients). The simulations and the analyses of real data are performed very carefully and with original and solidly documented approaches, using extensive simulations and extensive data sets in the cultured cell data and in the in vivo experiments. The paper is clearly written and well-illustrated.

      Weaknesses:

      I found only one serious weakness in this study, but it is one that is of importance. Although the original work on FRs was done in an experimental model of epilepsy, the field really became prominent when ripples and fast ripples were found first in microelectrode recordings of epileptic patients and then in the intracerebral EEG of such patients. Numerous studies have been performed since then, with a valuable meta-analysis including 700 patients (Wang Z, Guo J, van 't Klooster M, Hoogteijling S, Jacobs J, Zijlmans M. Prognostic Value of Complete Resection of the High-Frequency Oscillation Area in Intracranial EEG: A Systematic Review and Meta-Analysis. Neurology. 2024 May 14;102(9)). Although the consensus at this point is that FRs are not the ideal and totally specific marker of epileptic tissue that many thought it could be, FRs are nevertheless much more frequent in epileptic tissue than in non-epileptic tissue and are a solid biomarker. It is also well established that they are much more frequent in NREM sleep than in wakefulness, as reported in the original paper of Staba et al (Staba RJ, Wilson CL, Bragin A, Jhung D, Fried I, Engel J Jr. High-frequency oscillations recorded in human medial temporal lobe during sleep. Ann Neurol. 2004 Jul;56(1):108-15; not mentioned in this paper) and in the study of Bagshaw et al (2009). In this last paper, using SEEG in various brain regions, the average rate of FRs in NREM sleep is about 6 times that in wakefulness. In the paper by Staba, with microelectrodes in mesial temporal structures, it is about twice. As a separate issue, the paper of Frauscher et al (Frauscher B, von Ellenrieder N, Zelmann R, Rogers C, Nguyen DK, Kahane P, Dubeau F, Gotman J. High-Frequency Oscillations in the Normal Human Brain. Ann Neurol. 2018 Sep;84(3):374-385), which is not quoted, found that, in an extensive sample, non-epileptic human tissue sampled with SEEG generated extremely rare FRs (an average rate of 0.04/min/channel, i.e. 1 every 25 min).

      The results above are mentioned because they do not fit with the data provided in the present study: FRs are much more frequent in NREM sleep than in wakefulness in human epileptic patients, and they are much more frequent (not 30% more, but many hundreds of percent more) in epileptic tissue than in non-epileptic human tissue. The fundamental phenomenon of interest is, I believe, the FRs in epileptic patients. The animal experiments, tissue studies, and simulations are models to study the human phenomenon. With respect to the modulation by sleep and the differentiation between epileptic and non-epileptic tissue, it seems that the systems studied in this paper are not good models of the human condition. The human results presented in the study only reflect wakefulness recordings, which is not the condition in which most HFO studies have been done and in which most HFOs occur. The authors refer to the study of long-term fluctuations in HFO rates by Gliske et al. (2018) to say that one has to be careful with the results regarding sleep, for example, Bagshaw et al (2009), but the clear predominance of HFOs in NREM sleep has been observed by many studies. The cautions regarding fluctuations over extended periods also apply to the awake human data analyzed in this study.

      The study's conclusions regarding the generation of FRs are therefore questionably applicable to the human condition. I do not dispute their validity for the models and situations in which they were studied.

    4. Reviewer #3 (Public review):

      Summary:

      An outstanding question in the field of high-frequency oscillations (HFOs) in the context of epilepsy is how these oscillations emerge, considering that they occur at such high frequencies, i.e., 250Hz, well above the firing ability of single neurons. One hypothesis that has been suggested in the past is that neurons that fire in an out-of-phase fashion, or rather at random intervals, may contribute to a spectrum of HFOs ranging from 250-500Hz that are observed in epilepsy. However, how possible it is that random action potentials could aggregate to the extent that they could give rise to HFOs in the so-called fast ripple (FR) frequency range (>200 Hz according to the authors) remains unclear. To test this hypothesis, they used computational modeling to randomly insert action potentials in a signal, and they found that this approach is sufficient to generate FRs. Some of the predictors of whether FRs could occur were neuronal count, firing rate, and synchronization. Besides computational modeling, they used different model systems to test whether this could be observed in neuronal cultures, in epileptic rats (intrahippocampal kainic acid model), and human data. Neuronal cultures treated with picrotoxin did not show evidence that FRs could be generated beyond chance aggregation of action potentials. They then asked whether synchronization and firing rate could play a role in the emergence of FRs. They found that changes in neural firing and synchronization, such as those occurring during different phases of the sleep-wake cycle, could affect the number of FRs occurring by chance aggregation, with more FRs seen during periods of wakefulness, a result that they replicated in human data.

      The authors largely achieve their proposed aims of demonstrating that random neuronal firing can, in principle, generate FRs. Results from this study could influence current thinking around mechanisms generating FRs in epilepsy. The use of different computational approaches and model systems could offer new analytical methodologies for the study of FRs in the context of brain disease.

      Strengths:

      (1) The authors used a multi-level approach combining computational modeling with experimental datasets, including neuronal cultures, a rat model of temporal lobe epilepsy, and human data.

      (2) Identification of key parameters such as neuronal count, firing rate, synchronization, and brain state in observed incidence of FRs generated through random aggregation of neural firing.

      (3) Cross-species validation increases the likelihood of generalizability of the findings.

      Weaknesses:

      (1) Some of the simulated FRs appear short in duration and may not meet standard detection and definition criteria, potentially influencing validity.

      (2) The neuronal culture approach does not directly test random insertion of action potentials, limiting interpretation.

      (3) Sleep is treated as a homogeneous state in the rat dataset, without accounting for stage-specific differences in synchronization, which may affect the results and interpretation.

      (4) The analyses conducted in human data lack direct comparison with sleep data.

    1. eLife Assessment

      This study uses convincing modeling methods and analyses of rich behavioral datasets to investigate the role of attention in value-based decision making; for instance, as when choosing between two snacks. The results are valuable, as they challenge existing theories that assume that paying attention to an available option biases the eventual choice toward that option. The results suggest that the correlation between attention and decision-making is formed largely after and not before the (internal) choice process has terminated, a finding that offers an intuitively appealing rethinking of how attention and decision-making processes interact during value-based choices.

    2. Reviewer #1 (Public review):

      Summary:

      This study examines whether gaze direction actively shapes choice during food preference decisions or whether gaze and choice evolve largely independently until the moment of commitment. The established framework in this context, the aDDM, assumes that gaze causally biases the accumulation of evidence in favour of the fixated item. The authors show convincingly that this model fails to fit key behavioural patterns across several datasets, as do other published models that make the same assumption. The authors propose an alternative model (Post-Decision-Gaze or PDG) in which gaze and decision formation are decoupled: gaze does not influence the decision process, nor is it drawn toward the ultimately chosen item, until after the decision threshold is reached. Only during the motor execution period (after commitment) is gaze directed to the chosen option. They demonstrate that this model fits several observed patterns better than the aDDM and related variants.
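
      For readers unfamiliar with the assumption under test, the following is a minimal sketch of an aDDM-style trial in which the drift depends on which item is fixated. It is an illustration, not the authors' code: the fixed alternating fixation duration, the ±1 decision bounds, the `start_left` switch, and the default parameter values are all assumptions chosen for simplicity.

```python
import random

def simulate_addm_trial(r_left, r_right, d=0.0002, theta=0.3, sigma=0.02,
                        fix_ms=400, max_ms=20000, start_left=True, rng=None):
    """Simulate one aDDM-style trial in 1 ms steps.

    The value of the unattended item is discounted by theta, so gaze
    causally biases evidence accumulation toward the fixated item.
    Returns (choice, rt_ms); choice is None if no bound is reached.
    """
    rng = rng or random.Random()
    v = 0.0                                   # relative decision value
    looking_left = start_left
    for t in range(1, max_ms + 1):
        if looking_left:
            drift = d * (r_left - theta * r_right)
        else:
            drift = d * (theta * r_left - r_right)
        v += drift + rng.gauss(0.0, sigma)    # noisy accumulation
        if v >= 1.0:
            return "left", t
        if v <= -1.0:
            return "right", t
        if t % fix_ms == 0:                   # assumed: gaze alternates at fixed intervals
            looking_left = not looking_left
    return None, max_ms

# Usage: equal item values, so any choice bias comes from gaze alone.
print(simulate_addm_trial(3, 3, rng=random.Random(0)))
```

      In this sketch, setting `theta=1.0` removes the gaze dependence of the drift entirely, which is the decoupling that a post-decision account assumes during evidence accumulation.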

      Strengths:

      The work thoroughly considers multiple models and datasets. It advances an interesting alternative perspective on gaze-decision interactions and highlights meaningful shortcomings in existing models. The authors take the time to explain how modelling assumptions produce specific patterns in the data, which is certainly insightful to readers interested in the modelling of value-based decision making.

      Weaknesses:

      It is unclear to what extent the model's success relies on the way non-decision time is formalised in the model. In the proposed PDG model, non-decision time is decomposed into separate visual encoding, saccadic execution, and manual execution components. Several values (assumed or recovered) do not match known physiological or behavioural ranges. This is a common issue in the literature, and the authors may want to address it in light of broader work discussing what non-decision time consists of in both manual and saccadic actions (e.g., Bompas et al., 2024, Non decision time: the Higgs boson of decision, Psychological Review).

      In particular, the "saccadic execution" parameter appears far too long and too variable to reflect merely execution; instead, it likely includes decisional components. This would make more sense since manual and saccadic planning essentially rely on distinct brain areas, hence it seems unrealistic that crossing a single threshold would trigger both manual and saccadic execution. Similarly, recovered manual non-decision times are substantially longer (though not more variable) than expected motor execution durations for button presses. These patterns suggest that parts of what the model treats as non-decision time are likely decisional in nature, although perhaps related to "action decision" rather than the "value-based decision" of interest to the authors. To what extent these two processes neatly follow each other or overlap could be usefully considered.

    3. Reviewer #2 (Public review):

      Summary:

      Zylberberg et al. reanalyze eye-tracking and behavioral data (mostly from Krajbich et al., 2010) to test two predictions of the attentional Drift Diffusion Model, finding that these predictions are not met. Similarly, predictions of normative models (inspired by rational inattention) are not in line with the data, and the authors propose a post-choice model of attention. This model better accounts for the two effects but also does not account for all patterns, so the authors conclude that eye movements most likely reflect both pre- and post-decisional processes.

      Strengths:

      A clear strength is the systematic falsification-based approach of the paper, establishing (partially) new predictions and testing to what extent these are met by extant models and by a newly developed theory. The authors do a good job in providing intuitions behind the effects and the reasons why models such as the aDDM predict them. The paper is of substantial relevance for the field, as it shows that effects pertaining to the last fixation(s) should be interpreted with caution. Another strength is the paper's transparency as the authors clearly acknowledge that their new model does not do a perfect job either.

      Weaknesses:

      The paper focuses on analyzing the Krajbich 2010 data, but shows that the second effect replicates in many other datasets. A more principled approach, in which both effects are analyzed and presented for all datasets, would be more convincing. The results should then be shown together for clarity/readability.

      Similarly, it would be nice to show to what extent the models' predictions depend (or do not depend) on using the best-fitting parameter values (are there any parameter settings under which the two effects are not predicted?)

    4. Reviewer #3 (Public review):

      Summary:

      In this study, the authors reanalyzed choice, RT and gaze datasets collected from human subjects performing a food-choice task. They show that models that posit a causal role for attention in shaping the decision-making process fail to account for empirical observations in the data. These include the attentional drift diffusion model (aDDM) and models that derive attention-choice associations from an optimal policy. The authors show that a model that assumes that gazes are directed towards the chosen option after decision commitment captures more (but not all) empirical findings, suggesting that attention may reflect decisions once they are made instead of contributing to their formation. However, this post-decision-gaze (PDG) model failed to capture all aspects of the data, suggesting that gaze may reflect both decisional and post-decisional operations, and existing models are still missing some features of the gaze-directing process. The authors provide convincing evidence that post-decision gaze explains a number of empirical findings in this task.

      Strengths:

      (1) The analyses are generally appropriate, and the conclusions are supported by the data.

      (2) The study was rigorous, as the authors considered a number of alternative possible models for behavior, and evaluated their performance based on a wide range of qualitative predictions (as opposed to exclusively relying on model comparison).

      (3) The proposal that gaze may largely reflect post-decisional processes is interesting, and as far as I am aware, novel.

      Weaknesses:

      There was limited discussion about why one might allocate attention post-decision. I would have appreciated more discussion on the potential functional consequences or implications of post-decision gaze.

    1. eLife Assessment

      This study provides a valuable contribution to understanding grid-to-place transformations, offering new insights into the structure and reliability of these representations and extending prior work in a meaningful way. The evidence supporting the authors' conclusions is solid, based on careful analyses and well-executed experiments, although clarity and mechanistic interpretation would be strengthened by improving sample size reporting, expanding population-level analyses, and future studies including simultaneous entorhinal-hippocampal recordings. The work will be of interest to neuroscientists studying spatial coding and hippocampal-entorhinal circuit function.

    2. Reviewer #1 (Public review):

      This manuscript investigates how chemogenetic depolarization of medial entorhinal cortex layer II stellate cells reshapes spatial coding in downstream hippocampal CA1. Building on the authors' prior work (Kanter et al., Neuron 2017), the study examines changes in grid cell subfield firing rates and CA1 place cell firing patterns after CNO administration. A central advance of the present work is the use of the same manipulation on two consecutive days. The authors show that the induced grid subfield rate changes are highly similar across days and that CA1 place field reorganization is likewise reproducible across days. In addition, they report that CA1 remapping after CNO is not arbitrary. The new main place field often emerges at a location that can be anticipated from the baseline rate map of the same cell, typically corresponding to a weak secondary peak outside the primary field. Finally, the authors demonstrate that these experimental findings can be recapitulated in a feedforward grid to place cell model by selectively redistributing grid subfield firing rates, supporting the interpretation that grid subfield rate changes are sufficient to drive predictable and reproducible place field reorganization.
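
      The logic of the feedforward account can be sketched with a toy example. This is not the authors' model: it assumes a one-dimensional track, Gaussian grid subfields with hand-picked centers and amplitudes, equal synaptic weights, and a simple threshold. All names and numbers are hypothetical. The point is only that redistributing subfield rates can move the place field to a location that was already a weak secondary peak at baseline.

```python
import math

def grid_rate(pos, subfields, width=5.0):
    """Rate of one grid cell at pos; subfields is a list of (center, amplitude)."""
    return sum(a * math.exp(-((pos - c) ** 2) / (2 * width ** 2))
               for c, a in subfields)

def place_map(grid_cells, weights, threshold=0.2, n_pos=100):
    """Thresholded weighted sum of grid inputs at each integer position."""
    rates = []
    for pos in range(n_pos):
        drive = sum(w * grid_rate(pos, g) for g, w in zip(grid_cells, weights))
        rates.append(max(0.0, drive - threshold))
    return rates

def peak(rates, exclude_center=None, exclude_radius=15):
    """Position of the largest rate, optionally ignoring a window around a field."""
    candidates = [p for p in range(len(rates))
                  if exclude_center is None or abs(p - exclude_center) > exclude_radius]
    return max(candidates, key=lambda p: rates[p])

# Baseline: two grid cells whose strongest subfields coincide at position 20.
baseline = [[(20, 1.0), (70, 0.5)], [(20, 0.8), (50, 0.4)]]
# After the manipulation (assumed): same subfield locations, redistributed rates.
shifted = [[(20, 0.3), (70, 1.2)], [(20, 0.3), (50, 0.5)]]
weights = [1.0, 1.0]

base_map = place_map(baseline, weights)
main_field = peak(base_map)
predicted = peak(base_map, exclude_center=main_field)   # weak secondary peak
new_field = peak(place_map(shifted, weights))
print(main_field, predicted, new_field)
```

      Here the subfield locations never change; only their relative amplitudes do, which mirrors the claim that rate redistribution alone is sufficient to produce predictable field reorganization.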

      Overall, this study is positioned as a follow-up to the authors' previous report in which the main phenomenon (grid subfield rate remapping and accompanying CA1 place cell remapping following chemogenetic depolarization of MEC layer II neurons) was already established. While the conceptual novelty is therefore incremental, the present manuscript adds important and convincing evidence about two key properties of this phenomenon, including its reproducibility across days and the extent to which the direction of place field reorganization is predictable from baseline activity. The experimental approach and analyses appear generally appropriate and carefully executed, and the inclusion of modeling strengthens the mechanistic interpretation. These results provide useful new insight into stable input-output relationships within the entorhinal hippocampal system, and the work will be of interest to researchers studying remapping and the grid to place cell transformation.

    3. Reviewer #2 (Public review):

      Summary:

      Hippocampal remapping - the collective reorganization of neural tuning properties - is thought to be a crucial determinant of memory outcomes. Understanding its mechanistic bases is a fundamental goal of neuroscience and likely to be critical to understanding memory in health and disease. Here, Lykken et al. 2025 leverage a unique empirical manipulation paired with computational modeling to investigate how one mechanism - reorganization of grid cell subfield firing rates - impacts hippocampal remapping. The authors find that repeated chemogenetic excitation of MEC stellate cells induces reliable reorganization of grid cell subfield firing rates, which is in turn coupled with reliable hippocampal remapping. Notably, the authors show that this hippocampal remapping is not random but predictable, with changes in field location that can be predicted based on weak out-of-field firing observed during control sessions. These findings were well-replicated by a simple model of grid-to-place transformation.

      Strengths:

      This work has many strengths. One key strength of this work is its compelling demonstration that chemogenetic activation of stellate cells induces changes to the grid and place cell representations, which are reliable across repeated activations. This reliability means that the functional changes induced by this manipulation are not merely noise but rather contain a consistent structure that can be investigated to gain insight into the entorhinal-hippocampal transformation. Similarly, the demonstration that hippocampal remapping during this manipulation is not random, but predictable at the single-cell level, is also a strength. This predictability can help us distinguish competing mechanisms of remapping and place field formation more generally. Finally, by reproducing key experimental outcomes with a straightforward grid-to-place computational model, the authors show that this relatively simple model is sufficient to understand their results.

      Weaknesses:

      This work also has limitations that leave some relevant questions open at this time. One such set of questions, which might be addressable with the authors' data and modeling, concerns population analyses. Do grid fields at similar locations exhibit similar changes in field properties, or do these fields change independently? Are changes in field location consistent or inconsistent among simultaneously recorded place cells? Would we expect or not expect such a structure given the model? These results might help discriminate between different mechanisms possibly at play.

      Another limitation of this work is its reliance on a single measure of predictability. While this is a great start, and the various controls and modeling are appreciated, I wonder whether the modeling could be used to generate additional verifiable predictions. For example, perhaps analyzing whether there is or is not structure to unpredictable errors (are these distributed around predictions but further away, or are they random)?

      Finally, one limitation comes from the between-group nature of the recordings. Because the MEC and hippocampus are recorded in separate groups of animals, the authors lose the ability to test whether each mouse's particular grid field reorganization predicts its particular pattern of remapping. If the authors' model is correct, then one might hope to be able to predict with even higher accuracy the particular patterns of remapping in CA1 given sufficiently well-characterized grid field changes. This ambitious goal would require simultaneous recordings from the hippocampus and entorhinal cortex, which are beyond the scope of the current work, but would ultimately yield even more compelling evidence of the grid-to-place transformation underlying this form of remapping.

  2. Apr 2026
    1. eLife Assessment

      This work provides a map of enhancer-promoter interactions associated with genes controlling the development of a specific neuronal cell population. The study offers a valuable resource and integrates multiple complementary datasets to provide insights into regulatory mechanisms, although the conceptual advances are moderate and the central message could be clearer. The evidence supporting the conclusions is generally solid, but the lack of direct functional testing of key regulatory elements limits the strength of some claims.

      [Editors' note: this paper was reviewed by Review Commons.]

    2. Reviewer #1 (Public review):

      This study by Riegman & George et al. investigates the roles of the chromatin remodeling factor CHD7 and the proneural transcription factor Atoh1 at enhancers in cerebellar granule cells (GCs). Enhancers were categorized based on epigenetic marks and cross-referenced with promoter capture-HiC, ATAC-seq, and expression datasets to identify their long-range target genes, which were found to be enriched for critical neurodevelopmental processes. Differential expression and chromatin accessibility analyses in CHD7 knockout (KO) conditions suggest that this factor regulates a significant number of enhancers. These same enhancers are enriched for proneural transcription factor motifs, with Atoh1 being the most frequently present and likely the most affected. Finally, the direct interaction between CHD7 and Atoh1 was assessed via co-immunoprecipitation in co-transfected cells.

      While the paper presents an interesting aspect of enhancer regulation in neurodevelopment, several points warrant attention:

      Major Strengths:

      The use of chromatin marks increases the resolution of promoter-interacting enhancer regions when integrated with capture-HiC, refining the identification of distal enhancers. Additionally, performing promoter capture-HiC experiments for the first time in this cell type constitutes a valuable resource for the community working on 3D genome organization and neurodevelopment.

      Major Weaknesses:

      As noted by the authors, limited sequencing depth reduces confidence in the conclusions and may result in missed weaker long-range interactions. Furthermore, the absence of capture-HiC and Atoh1 ChIP-seq experiments in the KO condition prevents direct comparison, thereby limiting the strength of the conclusions.

      Additional Consideration:

      Caution should be exercised regarding the assumption that every enhancer must physically contact its target promoter. While true for many enhancers, some act in trans through eRNAs or lncRNAs without direct physical contact.

    3. Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors aim to identify active, long-range regulatory interactions in cerebellar granule cell progenitors (GCps). To this end, the authors perform promoter capture Hi-C to map long-range interactions for all gene promoters, using cells isolated from P7 mouse brain samples. While the resolution of these maps is limited by the relatively large fragment sizes generated from a 6-bp cutter, the authors combine these interactions with other available published datasets, including from their own previous work (e.g., ATAC-seq and ChIP-seq), to more precisely map putative enhancers within the long-range interacting regions of captured promoters. The paper further focuses on the importance of the transcription factor Atoh1 and the chromatin remodeller CHD7 in the regulation of these putative enhancers in GCps. The authors suggest a direct interaction between CHD7 and Atoh1 by overexpression and co-immunoprecipitation in human embryonic kidney cells.

      As stated by the authors, this study represents a valuable resource for researchers interested in the identification of enhancers in GCps cells, and their linked target genes. While broadly descriptive, the study does highlight some gene loci of interest and of biological relevance. For example, through integration of previously published datasets, the study resolves which putative regulatory elements at the Reln locus may regulate its activity.

      This manuscript will be of interest to researchers analysing long-distance regulatory targets, as well as to researchers trying to understand precise gene regulation in cerebellar development. It may also be of interest to clinical geneticists interpreting novel putative non-coding disease mutations.

      Strengths:

      The strengths of this manuscript are the integrated approach to identifying cell-type-specific enhancers using available epigenomic datasets, and the use of 3D genome topology to link them directly to their target genes. One example is the Reln gene, previously implicated in cerebellar phenotypes in CHD7 mutants. The pcHi-C dataset generated in this study provides the community with a valuable reference of enhancer-promoter pairs for a specific cell type of interest with human disease relevance.

      Weaknesses:

      The limitations of the study are partially addressed in the text by the authors, including the resolution of the pcHi-C using a 6-bp cutter, the limitation of sequencing depth (more interactions may have been identified with more depth), and the limited correlation between replicates (likely due to undersampling the library). For example, page 9 states: "some additional interactions with the nearest gene promoters might be identified in our pcHi-C dataset with deeper sequencing".

    4. Reviewer #3 (Public review):

      Summary:

      In this work, Riegman et al. establish the promoter interactome of cerebellar granule cell progenitors (CGPs) and identify thousands of putative enhancers regulating key genes in this cell population. The authors isolate primary CGps cells from the mouse cerebellum and perform promoter capture Hi-C in order to reanalyse previously generated epigenomic datasets (ATAC-seq, H3K4me1/3, H3K27ac) in these cells. They identify 22'797 enhancers interacting with gene promoters. The authors then use CHD7 ChIP-seq experiments to better annotate regulatory regions linked to genes deregulated upon CHD7 loss of function. After observing that CHD7 is frequently co-bound with ATOH1, they compare the binding profiles of ATOH1 and CHD7 together with genes deregulated in loss-of-function datasets, and refine the regulatory elements associated with each of these proteins.

      Strengths:

      The work is well designed and carefully executed, leading to an enhancer-promoter (E-P) interaction cartography that largely surpasses the current standard in the field. The pc-HiC dataset enables a deeper analysis of previously generated datasets (ChIP-seq and loss-of-function), which clearly improves the understanding of the mechanisms underlying CGps proliferation and differentiation. Moreover, the integration of published loss-of-function datasets for CHD7 and ATOH1 is relatively novel in this type of study and helps reduce the purely descriptive nature of the work. In particular, the analysis sheds light on genes with potential functions in CGps that had not previously been identified, as well as their regulatory connections. Overall, the study is convincing and supports the conclusions presented by the authors.

      Weaknesses:

      (1) A substantial part of the manuscript focuses on E-P interactions in CGPs, which gives the impression that this is primarily a genome organisation study. However, in this regard the manuscript does not bring major conceptual novelties. In contrast, the biological insights related to CGPs and the identification of new candidate genes likely represent the most novel aspect of the work. The authors should clarify the central message of the manuscript and reorganise the presentation of the results accordingly.

      (2) The numbers presented throughout the manuscript are sometimes confusing. For instance, the authors initially report 106'589 PIF (line 175), but later only 61'928 (line 243) when calling enhancers. The relationship between these numbers is not straightforward. More generally, simplifying the nomenclature used to describe interaction analyses would help emphasise the biological insights rather than the computational framework.

      (3) ATAC-seq alone is a relatively poor predictor of enhancers. In this context, H3K27ac would provide a more accurate marker of enhancer activity. This point is particularly important because the authors' data suggest that CHD7 does not function as a pioneer factor capable of opening chromatin. Instead, this role appears to be more closely associated with ATOH1. Therefore, alterations in CHD7 are more likely to affect enhancer activity (reflected by H3K27ac) rather than chromatin accessibility itself. If the authors do not have access to H3K27ac ChIP-seq data, this limitation should be explicitly acknowledged.

      (4) The authors do not functionally test most enhancers and instead discuss primarily putative enhancers (with the exception of VISTA-tested elements). Although the term "putative enhancer" appears in some subsections, it is not consistently applied throughout the manuscript. This limitation should be clearly stated early in the manuscript with a sentence such as: "As these regions have not been functionally validated, they should be considered putative enhancers. However, for simplicity, we will refer to them as enhancers throughout the manuscript."

      (5) Where feasible, the enhancer identified at the Reln gene should be functionally tested to demonstrate the added value of the approach.

    5. Author response:

      General Statements

      We thank the reviewers for their careful and supportive reviews of our manuscript. We have addressed all the reviewers' comments and extensively revised the manuscript accordingly.

      During our revisions, we discovered a bug in the code that calculated the linear genomic distance between the captured promoter regions (bait regions) and the promoter-interacting fragments (PIFs). The error inadvertently halved the distance measurements in the output tables. This has been corrected in the revised manuscript and has resulted in updates to Figure 1B and corrected values in the ‘interaction_distance’ and/or ‘interaction_type’ columns of Supplementary Tables 2, 3, 6 and 8. We thank the reviewers for the opportunity to correct this.
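
      As an illustration of the kind of calculation involved, the following is a minimal sketch of a bait-to-PIF distance computation. It assumes the distance is measured between fragment midpoints, which the response does not state; the `Fragment` class, the function names and the coordinates are hypothetical and are not taken from the authors' code. The `edge_gap` helper shows one simple sanity check that can flag a halved value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """A genomic interval in bp (hypothetical structure for illustration)."""
    chrom: str
    start: int
    end: int

    @property
    def midpoint(self) -> float:
        return (self.start + self.end) / 2


def interaction_distance(bait: Fragment, pif: Fragment):
    """Linear distance in bp between midpoints; None for trans pairs."""
    if bait.chrom != pif.chrom:
        return None  # no linear distance is defined across chromosomes
    return abs(pif.midpoint - bait.midpoint)


def edge_gap(bait: Fragment, pif: Fragment) -> int:
    """Gap in bp between the nearest edges of two cis fragments (0 if they overlap)."""
    return max(0, max(bait.start, pif.start) - min(bait.end, pif.end))


# Made-up coordinates, not values from the manuscript.
bait = Fragment("chr5", 1_000_000, 1_004_000)
pif = Fragment("chr5", 1_500_000, 1_506_000)

distance = interaction_distance(bait, pif)
print(distance)                             # 501000.0
# A midpoint distance can never be smaller than the gap between the fragments,
# so a halved value fails this check when the fragments are far apart.
print(distance / 2 >= edge_gap(bait, pif))  # False
```

      The check only catches a halving error when the fragments are small relative to their separation, so it complements rather than replaces a direct test of the distance function.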

      Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity):

      In this article, the authors conducted promoter-capture HiC experiments (pcHiC) in mouse cerebellar granule cell progenitors (GCps) and obtained a good 3D genome interaction map of protein-coding genes' promoters. This dataset was later integrated with ATAC-seq and ChIP-seq experiments to identify putative enhancer regions within promoter-interacting regions, and with higher base-pair resolution than what is obtained by pcHiC experiments. This set of enhancers is then compared to and presented as being more reliable than those present in the VISTA enhancer database. In addition, ATAC-seq sites and RNA-seq datasets, both obtained in WT and CHD7 KO conditions, are integrated to correlate expression of a set of genes to the chromatin accessibility of their distal enhancer(s), which is believed to be promoted by CHD7. The study is completed by focusing on transcription factor motif analysis on CHD7-regulated enhancers, which shows an enrichment for proneural transcription factors, with special emphasis on Atoh1, found to be frequently co-recruited with CHD7. Data and methods are well detailed and correctly replicated and will be useful as a resource for the community. The overlap obtained between pcHiC experiments and auto-criticized by the authors is very common and expected in this kind of experiment. In general, the conclusions drawn in the article are convincing but some aspects, such as the comparison to VISTA and the naming of 'enhancers', should be moderated.

      We thank the reviewer for their positive and constructive comments. We have amended the manuscript as indicated in detail below.

      (1) The comparison of pcHiC-identified enhancers vs. VISTA enhancers should be more balanced, as the two approaches have important conceptual differences. Although VISTA enhancers are based on functional annotation, their target genes might not necessarily be correctly assigned based on the distance. On the other hand, putative enhancer regions identified by pcHiC experiments do not rely on functional testing. So both types of information are useful but can be put in perspective.

      We thank the reviewer for making this point. We have amended the text to present a more balanced view e.g. “Using VISTA-designated hindbrain enhancers as an example, we identify the genes most likely regulated directly by these enhancers and update their annotation accordingly.”

      (2) To increase the strength of the paper, it would be preferable that authors include simple functional enhancer assays (e.g. CRISPR deletion of contacting enhancer, luciferase assay) to support their perspective since 3D conformation information in KO condition is lacking in the article. Although ideally these experiments should be better performed for a full demonstration, it would be acceptable to at least include a simple functional assay in the WT context to demonstrate that the regulatory regions obtained by crossing genomic data are real enhancers. This point is even more critical knowing that enhancers lacking classical histone marks (H3K27ac+H3K4me1) have been described. The same comment applies to promoter interacting fragments lacking these marks, that could be missing enhancers (i.e. enhancers without these marks).

      To address this point, we performed luciferase assays to show that putative enhancers identified with our integrated bioinformatic approach (pcHi-C + ATAC-seq + H3K4me1 + H3K27ac) do indeed exhibit enhancer activity. For these experiments, we tested these putative fragments in SHH-NPD, an immortalized GCp-derived cell line generated by the Fults laboratory (Jenkins et al. 2014). The results of these experiments are included as Suppl. Fig. 1 in the revised manuscript.

      Minor point

      - Figure 5B is lacking labels.

      We apologise for this oversight – labels have now been added.

      Reviewer #1 (Significance):

      This article, when completed with possible revision, will be useful to the community as a resource of experimentally determined putative enhancers in cerebellar granule cell progenitors. It also provides some insights into the association of CHD7 and Atoh1 in distal regulation in these cells.

      We thank the reviewer for acknowledging the significance of our work.

      Reviewer #2 (Evidence, reproducibility and clarity):

      In this manuscript, the authors aim to identify active, long-range regulatory interactions in cerebellar granule cell progenitors (GCps). As such, the authors perform promoter capture Hi-C to map long-range interactions for all gene promoters, using cells isolated from P7 mouse brain samples. While the resolution of these maps is limited by the relatively large fragment sizes generated from a 6-bp cutter, the authors combine these interactions with other available published datasets, including from their own previous work, (e.g. ATAC-seq and ChIP-seq) to more precisely map putative enhancers within the long-range interacting regions of captured promoters. The paper further focuses on the importance of transcription factor Atoh1 and chromatin remodeler CHD7 in regulation of these putative enhancers in GCps. The authors suggest a direct interaction between CHD7 and Atoh1 by overexpression and co-immunoprecipitation in human embryonic kidney cells.

      As stated by the authors, this study represents a valuable resource for researchers interested in the identification of enhancers in GCps cells, and their linked target genes. While broadly descriptive, the study does highlight some gene loci of interest and of biological relevance. For example, through integration of previously published datasets, the study resolves which putative regulatory elements at the Reln locus may regulate its activity.

      We thank the reviewer for their supportive comments.

      We provide a summary of our major and minor comments here.

      Major comments:

      (1) The main take-home messages of the manuscript could be more clearly stated in the introduction to help readers understand the main conclusions of the work.

      We have added a sentence to the Introduction to clarify the key take-home messages:

      “We report putative distal regulatory elements for >12,000 genes, identify CHD7- and Atoh1-regulated enhancer elements and show that these factors interact and likely co-regulate the expression of key genes in the GCp lineage.”

      (2) In the discussion, a previous Hi-C dataset is referred to "Reddy et al. annotated 5,175 promoter-enhancer interactions in GCps using Hi-C without enrichment (Reddy, Majidi et al. 2021)." It would be beneficial to compare the interactions identified previously with the current study (5,175 vs 46,428 interactions).

      To address this comment, we have performed an additional analysis and include text, Suppl. Figure 3 and Suppl. Table 13 to demonstrate the extent to which the two datasets compare, overlap and diverge. We have also added text to the discussion to highlight the differences and technical considerations between the two approaches and how they complement each other.

      The 5,174 enhancer-promoter (E-P) interactions identified by Reddy et al were downloaded and intersected with the 46,428 promoter-accessible PIF regions identified in our study. The new Suppl. Figure 3A illustrates that 82% (843/1207) of the genes for which Reddy et al identify long-range interacting regions are represented in our pcHiC dataset. Our pcHiC data contains information on distal interacting regions and potential enhancer regions for an additional 11,511 protein-coding genes. Suppl. Figure 3B provides an overview of the Reddy et al E-P interactions that are, and are not, identified in the pcHiC data. We replicate 38% of Reddy et al's E-P findings, whilst 53% of the 3229 interactions unique to the Reddy data would not be detected in the pcHiC data for technical reasons resulting from the capture design and analysis protocol. For the remaining interactions that are specific to the Reddy data, we identify other distal regions interacting with those same promoters. Suppl. Table 13 details the full comparison of Reddy et al's E-P interactions that are found within our dataset.

      The differences between the two datasets and the increased number of interactions detected in the pcHiC dataset likely result from the increased enrichment for the captured promoters, which enables the detection of interactions that would have been below the detection threshold for the HiC study. In addition, there are notable differences in the analysis strategies for the two datasets, which also contribute to differences in the detection of regions. Reddy et al binned the HiC data into 10Kb regions to identify interacting regions and subsequently used chromatin marks to identify possible enhancer and promoter regions within these large regions. In contrast, we have used pcHiC and the CHiCAGO algorithm to identify individual HindIII restriction fragments that are proximal to targeted promoter regions (PIFs), and prioritised those that have accessible regions within them. These could represent various types of regions that play regulatory roles, such as enhancers, CTCF sites or facilitator regions, independent of their chromatin mark composition, rather than focusing solely on enhancers.

      (3) The authors identify an overlap with some of their identified enhancers with those from VISTA. Is this a fair comparison seeing as the enhancer reporters were tested during early embryonic development (e.g. E11.5 and E13.5) and seen to be active in the hindbrain, would these stages be relevant to GCps from P7? Can the authors identify ATAC-seq for example from hindbrain from embryonic stages and determine if the enhancer accessibility profile looks similar to that for the P7 GCps cells?

      We thank the reviewer for this important question regarding the developmental relevance of our VISTA comparison and acknowledge that direct comparison between the time points requires careful consideration. Firstly, to address the question of how similar the chromatin accessibility profiles are between the embryonic and P7 timepoints, we compared the ATAC-seq data from our paper to ENCODE data from the hindbrain. Of the 140 VISTA enhancers that were intersected with the pcHi-C dataset, 119 were identified from the lacZ studies as active in the hindbrain at E11.5 whilst 21 were identified as active at timepoint E12.5. We compared ENCODE ATAC-seq peaks from the E11.5 (ENCFF743IYX) and E12.5 (ENCFF198TLF) hindbrain to the GCps from P7 across both the entire genome (global accessibility) and specifically +/- 3MB around the VISTA enhancer regions in the PIFs from the pcHiC to assess the conservation of local accessibility profiles.

      When looking at the global accessibility profile of embryonic hindbrain versus P7 GCps across the whole genome there was a large degree of overlap with ~85% (E11.5) and ~88% (E12.5) of all ENCODE ATAC peaks overlapping with accessible ATAC summit regions from P7 GCps:

      Author response image 1.

      To identify if this was consistent in the immediate chromatin environment of the VISTA enhancers themselves, we compared the accessibility profiles across timepoints in the local environment surrounding the VISTA enhancers. This local environment was defined as a region that added an additional 3MB on either side of all VISTA enhancer positions found in PIFs. 3MB was chosen as the longest interaction found for a single VISTA element was approximately 2.7MB. Consistent with the global analysis, a similarly high level of overlap of accessible regions between the timepoints was found for the local chromatin environment surrounding the VISTA enhancers that were found within PIFs in the pcHiC dataset, with ~87% (E11.5) and ~89% (E12.5) of ENCODE-detected peaks overlapping with accessible ATAC summit regions from P7 GCps.

      Author response image 2.

      Regions +/-3MB of VISTA enhancers in PIFs

      Author response image 3.

      Regions +/-3MB of VISTA enhancers in PIFs
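
      The global and local overlap fractions described above can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' pipeline: the response does not say which tool was used, the helper names are hypothetical, the coordinates are invented, and the naive all-against-all comparison would normally be replaced by a dedicated interval tool for genome-scale peak sets.

```python
WINDOW = 3_000_000  # +/- 3 Mb flank, as described in the response


def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals on the same chromosome share a base."""
    return a_start < b_end and b_start < a_end


def local_windows(enhancers, flank=WINDOW):
    """Extend each (chrom, start, end) enhancer by `flank` bp on both sides."""
    return [(c, max(0, s - flank), e + flank) for c, s, e in enhancers]


def any_overlap(region, others):
    chrom, start, end = region
    return any(chrom == oc and overlaps(start, end, os_, oe)
               for oc, os_, oe in others)


def fraction_overlapping(query_peaks, reference_regions, windows=None):
    """Fraction of query peaks overlapping a reference region.

    If `windows` is given, only query peaks inside a window are counted.
    """
    if windows is not None:
        query_peaks = [p for p in query_peaks if any_overlap(p, windows)]
    if not query_peaks:
        return 0.0
    hits = sum(any_overlap(p, reference_regions) for p in query_peaks)
    return hits / len(query_peaks)


# Invented coordinates for illustration only.
vista_in_pifs = [("chr1", 10_000_000, 10_001_000)]
embryonic_peaks = [
    ("chr1", 9_000_000, 9_000_500),
    ("chr1", 12_000_000, 12_000_400),
    ("chr1", 50_000_000, 50_000_300),
    ("chr2", 9_000_000, 9_000_500),
]
p7_summit_regions = [
    ("chr1", 9_000_200, 9_000_400),
    ("chr1", 12_000_100, 12_000_300),
    ("chr1", 50_000_100, 50_000_200),
]

global_fraction = fraction_overlapping(embryonic_peaks, p7_summit_regions)
local_fraction = fraction_overlapping(
    embryonic_peaks, p7_summit_regions, windows=local_windows(vista_in_pifs)
)
print(global_fraction, local_fraction)  # 0.75 1.0
```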

      Genome browser shots at the three example VISTA loci from Figure 1 further support this approach. In addition, we note that a recent study by Chen et al (2024 https://www.nature.com/articles/s41588-024-01681-2), in which capture Hi-C was performed at E11.5 for 935 VISTA enhancers across multiple tissues, confirmed that the majority of VISTA enhancer regions (61%) bypass adjacent genes, which is consistent with our nearest gene comparison.

      (4) The co-IP experiment appears to support the conclusion that Atoh1 and CHD7 can interact, however there are bands in lanes where there should not be (i.e. Input lanes 1 and 4 for FLAG blot). It would be recommended to repeat this result at least once. [Expected time 2-4 weeks].

      This experiment has been repeated 3 times with the same result. It is normal for non-specific background bands to appear on Western blots from total cell lysates (inputs) as most antibodies have significant cross-reactivity. The anti-FLAG antibody clearly detects bands above background in lysates where FLAG-tagged CHD7 is expressed. Most critically, despite the presence of non-specific bands in input, FLAG-tagged CHD7 is only detected in immunoprecipitated samples in two cases: where FLAG-tagged proteins have been precipitated and FLAG-tagged CHD7 is expressed, or where HA-tagged Atoh1 has been precipitated and both FLAG-tagged CHD7 and HA-tagged Atoh1 are expressed.

      (5) The methods section describes analysis of several datasets, however we could not access the code at the time of review. Do the authors intend to make this code available at the time of publication?

      Yes, once the publication is approved, all code will be made available along with conda environment yaml files to replicate the software environment in which the analysis was performed.

      (6) Page 7 "replicate one and two, respectively". Can the authors clarify the number of biological replicates performed for pcHi-C?

      Two biological replicates were performed for pcHiC, which were then bioinformatically combined into a ‘superset’ for CHiCAGO interaction calling, as is standard practice for pcHiC data (see e.g. Cairns et al, 2016). We have revised the text to make this clearer.

      Minor comments:

      (1) Page 3 "controlling the expression of 577 genes in GCps" - the authors do not provide evidence that these enhancers control gene expression directly, this should be reworded.

      Thank you. We have reworded to: “contacting the promoters of 577 genes” to indicate that these were identified using pcHi-C and not functional assays.

      (2) Page 5 "where transient amplifying divisions exponentially expand GCps" - at what stages of embryonic/postnatal development are GCps first detected, and when do they amplify and then differentiate?

      GCps that form the EGL are specified in the rhombic lip from E13.5 (Machold, 2005 and Wang, 2005) and a clear EGL can be observed in the cerebellar anlage from E14 (Ben-Arie, 1997) of development. They amplify from this stage and differentiation, induced by neurogenic factors like NeuroD1 is visible from P0 onwards (Miyata, 1999). We have amended the text to include this additional information: “GCps that form the EGL are specified in the rhombic lip from E13.5 (Ben-Arie et al, 1997; Machold & Fishell, 2005) and a clear EGL can be observed in the cerebellar anlage from E14 (Ben-Arie et al., 1997) of development. They amplify from this stage and differentiation, induced by neurogenic factors like NeuroD1 is visible from P0 onwards (Miyata et al, 1999).”

      (3) Page 7 "identified 164,387 unique and significant interactions" - how is an interaction defined, a single read, or evidenced by a certain number of reads. "promoter interacting fragments or PIFs" - is PIF referring to a single read evidencing an interaction?

      An interaction is defined by the CHiCAGO algorithm. The number of reads needed to score an interaction depends on the distance of the PIF from the promoter (modelled using a distance-dependent component that accounts for the decay of contact frequency with genomic distance) and on a component that models how sequence or other technical artifacts might bias the capture of some sequences compared to others. For each promoter, a background model is generated of the expected number of reads that would be captured based on the above considerations; if the number of reads for a region exceeds this background model by a certain threshold, the interaction is deemed significant using a p-value-like score. In practice, this means that regions further from the promoter will often require fewer reads to reach significance than regions that are much closer to the promoter. The significant PIFs in the dataset are all evidenced by a minimum of 3 reads in at least one biological replicate. We have included a short explanation of this in the methods of the revised manuscript for clarity.

      The maximum reads in a single replicate library for a specific PIF was 1557, and the median number of reads per PIF was 17.
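
      The intuition that distal fragments can reach significance with fewer reads than proximal ones can be illustrated with a deliberately simplified toy model. This is not the CHiCAGO implementation, whose published background model and scoring are more elaborate; the power-law decay, the constant noise term, the plain Poisson tail and all parameter values below are assumptions chosen only for illustration.

```python
import math


def expected_reads(distance_bp, scale=2.0e6, decay=1.0, noise=0.05):
    """Toy background: distance-decaying contact term plus constant technical noise."""
    return scale * distance_bp ** (-decay) + noise


def poisson_sf(k, mu):
    """P(X >= k) for X ~ Poisson(mu), computed from the lower tail."""
    cdf_below = sum(math.exp(-mu) * mu ** i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf_below)


# Proximal fragment: many reads observed, but many are expected by chance.
proximal_mu = expected_reads(50_000)      # about 40 expected reads
p_proximal = poisson_sf(45, proximal_mu)

# Distal fragment: far fewer reads, but very few are expected by chance.
distal_mu = expected_reads(1_000_000)     # about 2 expected reads
p_distal = poisson_sf(12, distal_mu)

print(f"proximal: expected={proximal_mu:.2f}, observed=45, p={p_proximal:.3g}")
print(f"distal:   expected={distal_mu:.2f}, observed=12, p={p_distal:.3g}")
```

      In this toy setting the distal fragment with 12 reads is far more surprising under the background than the proximal fragment with 45 reads, which mirrors the behaviour described in the response.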

      (4) Page 8. What is the distinction between PIFs and "promoter interacting regions (PIRs)"? These could be better defined in the text.

      Thank you for picking up this discrepancy; we were using PIR and PIF interchangeably. We have amended the manuscript to refer to PIFs consistently throughout.

      (5) Figure 1C-F. Labels "Random" and "PIFs" don't line up well with the two bars.

      Thank you, this has been corrected.

      (6) Page 9. Could the authors show some representative images for the "VISTA hindbrain enhancers" (e.g. for Figure 1I-K).

      We have inserted representative images showing in vivo activity of these enhancers in mouse embryos from the VISTA enhancer site.

      (7) Fig 2G, Page 11 "The 12,354 genes that were linked to a PIF containing an ATAC-seq peak were found to have a higher median expression level than the 2,049 genes that had PIFs that did not coincide with ATAC-seq peaks" - is this significant?

      Apologies for this oversight. We have performed a two-sided t-test on the log transformed TPMs between the two groups and have included the significance in the revised figure (p=1.8 e-40).
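
      A comparison of this kind can be sketched as below. The response does not specify whether a Student or Welch test was used, which logarithm base was applied, or whether a pseudocount was added, so the Welch statistic, `log2` and the pseudocount of 1 are all assumptions here, and the TPM values are invented.

```python
import math
from statistics import mean, variance


def log_tpm(values, pseudocount=1.0):
    """log2-transform TPM values; the pseudocount (an assumption) avoids log(0)."""
    return [math.log2(v + pseudocount) for v in values]


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Invented TPM values for two small gene groups.
accessible_pif_genes = [12.0, 30.5, 8.2, 45.1, 22.3, 15.7]
inaccessible_pif_genes = [3.1, 0.5, 7.9, 1.2, 4.4, 2.0]

t_stat, dof = welch_t(log_tpm(accessible_pif_genes),
                      log_tpm(inaccessible_pif_genes))
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

      The two-sided p-value would then be read from the t distribution with `dof` degrees of freedom; in practice a library routine such as `scipy.stats.ttest_ind` with `equal_var=False` performs the whole calculation.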

      (8) "Gene Ontology analysis of genes with accessible PIFs revealed a significant enrichment for 119 biological processes" - can you include the GO terms in a supplementary table? Is there a way to prioritise down the 12,354 genes to a shorter more significant list of genes, this seems a long list to include in GO analysis.

      We have included a supplementary table with this data in the revised manuscript (Suppl. Table 6). We included all 12,354 genes in this analysis as the point of this analysis was to demonstrate that developmental processes are enriched in the PIFs with accessible chromatin, compared to the genes where only PIFs without ATAC were identified.

      (9) Page 11 - "The chromatin remodelling factor CHD7 is essential for normal expansion of GCps in the postnatal mouse cerebellum (Whittaker et al., 2017b) and deletion of Chd7 from GCps results in striking cerebellar hypoplasia and polymicrogyria (Feng et al., 2017; Reddy et al., 2021; Whittaker et al., 2017b). CHD7 haploinsufficiency is also sufficient to cause cerebellar hypoplasia and foliation defects both in mouse models and in the context of CHARGE syndrome in humans (Whittaker et al, 2017a; Yu et al, 2013)." - this appears more suitable for the introduction.

      Thank you, we have moved this text to the Introduction.

      (10) Page 12 "the majority of which (4,663/5,369) displayed decreased accessibility when Chd7 is depleted". This was difficult to understand initially - which are expected to be the direct effects? Increased or decreased accessibility? Perhaps it would be better to focus only on the decreased accessibility sites?

      We have previously shown that the majority of differentially accessible regions in Chd7-deficient GCps show decreased accessibility. Chromatin remodelling by CHD7 could conceptually reduce or increase accessibility of a particular locus and the only way to infer direct effects are by identifying regions to which CHD7 is recruited.

      Approximately 9% of the sites that decreased in accessibility overlapped with regions bound by CHD7 (464/4663), whilst ~2% of sites that increased in accessibility overlapped with regions of CHD7 binding (14/706). Whilst it is likely that the majority of directly regulated sites decrease in chromatin accessibility when CHD7 is removed, a small number of sites do increase in accessibility and should be included for completeness.

      (11) The analysis in Fig 3A reveals that only a small number of CHD7-bound enhancers show differential accessibility and altered linked gene expression upon CHD7-knock down. This requires a little more discussion - why do so many sites change in accessibility compared to the number of sites which change accessibility or are associated with gene expression change?

      Identifying CHD7-regulated enhancers is challenging, mostly due to the inefficiency of CHD7 ChIP-seq. The low quality of available CHD7 ChIP-seq data has made it particularly difficult to identify CHD7 peaks. However, the integration of this data with ATAC-seq accessibility, chromatin modification and pcHi-C data has allowed us to identify a subset of enhancers that are most likely directly regulated by CHD7. However, given these technical limitations, we would be hesitant to conclude from the present data that the majority of chromatin accessibility changes in enhancers in Chd7-deficient GCps are indirect. We have added the following text to the discussion to indicate this: “Identifying CHD7-regulated enhancers is challenging, mostly due to the inefficiency of CHD7 ChIP-seq. The low quality of available CHD7 ChIP-seq data has made it particularly difficult to identify CHD7 peaks. However, integrating CHD7 ChIP-seq data with ATAC-seq accessibility, histone modification ChIP-seq and pcHi-C data has allowed us to identify a subset of enhancers that are most likely directly regulated by CHD7. However, given these technical limitations, we would be hesitant to conclude from the present data that the majority of chromatin accessibility changes in enhancers in Chd7-deficient GCps are indirect, as suggested by the data in Fig. 3A.”

      (12) Page 12 - "Over-representation analysis confirmed an enrichment of genes linked to nervous system development" - could this and the GO term analysis be included in a supplementary figure?

      We have included these results as Suppl. Table 7 in the revised manuscript.

      (13) Fig 3D - what does the arrow represent in the chromatin schematic?

      The arrow in the schematic indicates chromatin remodelling – we have clarified this in the figure legend and added headings to these panels to indicate the 3 different types of elements: Direct CHD7 targets, Indirect targets and CHD7-bound elements.

      (14) Fig 3G does not appear to be referenced in the text. The value of the Upset plots in the main figure 3 wasn't very clear, perhaps these could be moved to the supplement? Is there a clearer plot to support the conclusion "CHD7 primarily regulates enhancers".

      We apologise; the panels were mislabelled in the text. This has now been corrected. We hope that the amendments in response to point 13 above now clarify these findings, showing that direct CHD7 targets are characterised by active enhancer marks.

      (15) Page 14 "putative consensus sites for proneural bHLH TAL-family of proteins Neurog2, Neurod2, Neurod1, and, Atoh1 in elements" - HOCOMOCO motifs are only shown for Atoh1 and Nhlh1. It may be valuable to show the sites for all the listed TFs. What does white represent in the heatmap in Fig 3H? This plot is difficult to interpret, and also relatively small in the figure but appears important to conclusions. Perhaps Fig 3H could be made more prominent?

      Thank you for highlighting that the white boxes might be confusing. The white blocks indicate that these motifs do not pass the threshold for significant enrichment in the dataset based on the p and q values. This has now been clarified in the figure legend.

      We have enlarged panel H to make it more prominent.

      (16) Page 15 - "Myb was the only motif specific to CHD7 bound regions that changed in accessibility compared to those that exhibited accessibility changes without CHD7 binding or CHD7 binding without accessibility changes (Suppl. Fig. 1)." I couldn't interpret this sentence, requires clarifying.

      We agree that this description is confusing and since it is difficult to draw clear conclusions about the significance of enhancers with Myb motifs in this context, we have removed this sentence from the revised manuscript.

      (17) Page 16 and Fig 4B - a discussion of why both up and down regulated genes are detected for Atoh1 depletion? Which class of genes are expected to be directly regulated (the down-regulated genes)?

      Like most transcription factors, ATOH1 may be able to function as both a repressor and an activator depending on the context. Although the majority of genes are downregulated in Atoh1-deficient cells, suggesting that Atoh1 functions as an activator in most cases, our analyses have identified several up-regulated genes that contain Atoh1 ChIP-seq peaks in their cognate enhancers (see Suppl. Table 7), consistent with these also being direct Atoh1 targets.

      (18) Fig 5B - the genomic traces are not labelled in this figure.

      Thank you, labels have been added.

      (19) Page 17 - "Pathway enrichment analysis of the 22 genes compared to all genes that were expressed in GCps shows a significant enrichment of terms: Hypoplasia of the pons (HP:0012110 P=0.006) and Abnormal pons morphology (HP:0007361 P=0.016) from human phenotype ontology, due to the presence of Reln, Dcc, Mab21l1 and Gli2." - this analysis should be included in the supplementary tables.

      These results have been included as Suppl. Table 12 in the revised manuscript.

      (20) Do the authors have a suggestion for which domains of Atoh1 and CHD7 could be interacting? Could the authors design truncated constructs for overexpression in HEK cells to test this hypothesis? [Expected time 4-6 weeks, interesting but not essential to do experimental work here].

      We agree this is an interesting question. Our collaborator, Professor Peter Scambler (UCL) has performed a yeast two hybrid screen for CHD7 interacting proteins in a mouse E11.5 library using the CHD7 BRK domain (aa 2521-2708) as bait. The screen had a single hit, which encompassed the N-term 127aa of ATOH1 (personal communication). This observation supports our co-IP data and suggests that the N-terminus of ATOH1 interacts with the BRK domain of CHD7 but further validation will be needed to confirm this.

      (21) Page 28 "Differential accessibility analysis was performed using DESeq2 (v 1.22.1)" and Page 19 "Whereas chromatin accessibility at some of these enhancers were affected by Chd7-deficiency" - what were the cutoffs used for looking at differentially accessible regions? Complete loss of accessibility or a quantitative change?

      Quantitative change rather than complete loss was used. Thresholds based on adjusted p-values (padj<0.05) were used as indicated in the methods.
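
      To make the thresholding step concrete, the sketch below applies a Benjamini-Hochberg adjustment, which is the default adjustment method in DESeq2, and then the `padj < 0.05` cutoff. The p-values are invented and the function is a plain reimplementation for illustration; DESeq2 additionally applies independent filtering by default, which this sketch omits.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Invented raw p-values for five regions.
pvals = [0.001, 0.008, 0.039, 0.041, 0.60]
padj = benjamini_hochberg(pvals)
significant = [p < 0.05 for p in padj]
print(significant)  # [True, True, False, False, False]
```

      A region is therefore called differentially accessible on the strength of the statistical evidence for a change in signal, whether or not accessibility is lost completely.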

      Requested comments on referencing:

      - "Long-range" - how do the authors define long-range? Can this be referenced. CO? good reference here.- look to CHiCAGO paper

      - "When chromatin conformation or 3D organisation data is not available, studies typically assign regulatory elements to the nearest gene promoter" - needs referencing.

      - "Many of these 22 genes regulated by CHD7 and Atoh1 have established critical roles in cerebellar development, including Neurod2, Pax6 and Gli2 (Fig. 5B)" - needs referencing. "from human phenotype ontology, due to the presence of Reln, Dcc, Mab21l1 and Gli2" - needs referencing.

      Thank you, references have been added.

      - "active enhancers (H3K27ac+, H3K4me1+), promoters (H3K27ac+, H3K4me3+), regulatory elements (H3K27ac+, H3K4me1+, H3K4me3+), or poised enhancers (H3K4me1+)" - needs referencing.

      Thank you, references have been added.

      - Reference required in main text for VISTA (e.g. Visel et al., 2007)

      Thank you, reference added.

      Reviewer #2 (Significance):

      The strengths of this manuscript are the integrated approach to identify cell-type specific enhancers utilizing available epigenomic datasets, and leveraging 3D genome topology to directly link them to their target genes. For example for the Reln gene previously implicated in cerebellar phenotypes for CHD7 mutants. The pcHi-C dataset generated in this study provides a valuable reference for the community of enhancer-promoter pairs for a specific cell-type of interest with human disease relevance.

      We thank the reviewer for recognising the potential value of our work to the community.

      The limitations of the study are partially addressed in the text by the authors, including the resolution from the pcHi-C using a 6-bp cutter, the limitation of sequencing depth (more interactions may have been identified with more depth), and the limited correlation between replicates (likely due to undersampling the library). Page 9 "some additional interactions with the nearest gene promoters might be identified in our pcHi-C dataset with deeper sequencing".

      We thank the reviewer for highlighting our acknowledgements of the potential limitations of our work.

      Additional limitations include the use of the VISTA browser mouse LacZ embryos to validate some of their enhancers, the limitation here being that the VISTA browser tests enhancers at embryonic stages (focused at E11.5 and E13.5) while the GCps cells were collected at P7. The LacZ images from VISTA are also not shown. The HEK cells used for the co-IP could be seen as a limitation as these are not relevant cells for the cell state studied, the authors could clarify their use of these cells.

      We thank the reviewer for their careful assessment of the limitations of our study. We have now included images of the VISTA enhancers in Fig. 1I,J,K. Rather than a limitation, using irrelevant cells for co-IP might be seen as a better approach, as conceivably an indirect interaction between the two proteins mediated by a bridging complex is less likely in an irrelevant cell type that might not contain such complexes. Either way, HEK293T cells are the standard laboratory model for co-IP studies as they can be transfected with ease.

      The study reported here is largely based on previous work from the authors (Whittaker et al 2017b). This study reported that the chromatin remodelling factor CHD7 is essential for normal expansion of GCps in the postnatal mouse cerebellum and deletion of CHD7 from GCps resulted in the phenotype of cerebellar hypoplasia. This study also largely leverages previously published datasets from the Whittaker et al 2017b (e.g. CHD7 deletion data) and reanalyses it in the light of the new pcHi-C datasets.

      This manuscript will be of interest to researchers interested in analysing long-distance targets of as well as researchers trying to understand the precise gene regulation in cerebellar development. It may also be of interest to clinical geneticists to interpret novel putative non-coding disease mutations.

      We thank the reviewer for highlighting the wide interest of our manuscript.

      In assessing this manuscript, my expertise lies in models of human development and gene regulation, with a focus on enhancer function.

      Reviewer #3 (Evidence, reproducibility and clarity):

      Riegman et al have explored the gene regulatory landscapes of cerebellar granule cell progenitors (GCps). They have generated promoter capture Hi-C data to identify regions that interact with promoters in these cells. In addition, they generate ATACseq data in wild-type and CHD7 knock-out cells. They integrate these data to identify enhancers that potentially regulate genes in GCps. In addition, the authors identify an interaction between CHD7 and ATOH1, whose binding sites also overlap in the genome.

      The dataset can be potentially interesting for people studying cerebellar development.

      I have a few concerns regarding the paper. The most pressing one is that the authors seem to equate interactions in pcHi-C with regulation. This is problematic for two reasons. First whether interaction equates regulation is still debated and whether this can be detected with a low-resolution C-method (i.e. using HindIII) is a further point of contention.

      We thank the reviewer for pointing this out. We agree and apologise for not being clear in our manuscript. We have made the necessary amendments to indicate that pcHi-C by itself only assesses proximity in the nucleus, not function.

      We acknowledge the limitations of the pcHi-C method, including that resolution is limited by the use of a restriction enzyme. However, we (see e.g. Suppl. Fig. 1) and others (see e.g. Freire-Pritchett et al (2017) and Mifsud et al (2015)) have used this approach successfully to identify functional enhancer elements.

      The second issue has to do with the way the pcHi-C data is interpreted. What is detected as a significant interaction by Chicago are regions that have a contact frequency above background. This means that local regions with a (much) higher contact frequency may not be called as significant. If we follow the logic that contact frequency is related to gene activation (which may not necessarily be true), whether a fragment is more frequently contacted than the background (relative contact frequency) should not matter; rather, it should be interpreted based on the absolute contact frequency.

      The reviewer is right that local regions will have a higher contact frequency and that local contacts aren’t always captured by the CHiCAGO model. However, the purpose of this study was to prioritise the identification of distal elements that are not captured by existing methods including nearest gene annotation.

      There are a number of reasons why absolute contact frequency might not be an appropriate measure to infer gene regulation: 1) Many factors can affect the absolute contact frequency, including the proportion of cells that are exhibiting active transcription at that time across a population, especially if expression is limited to a small number of this population at that time. 2) Absolute contact frequency assumes that more contact results in more regulation, which is not necessarily true and would depend on the combination of factors that are associated with that regulatory element. Figure 1 of the Micro capture C paper (https://www.nature.com/articles/s41596-023-00817-8) shows that regions with low absolute contact frequency compared to adjacent regions have the potential to regulate gene expression, as have other studies that have used CHiCAGO to identify regulatory elements. 3) The sequence of some fragments makes them more likely to be captured or enriched in the HiC protocol, which the relative contact frequency above background controls for.

      This becomes relevant because the authors claim that 80% of enhancers are wrongly annotated based on their metrics. The only way to correctly annotate an enhancer is to knock it out and check the effect on genes in the vicinity. Therefore, to claim that their method can correctly annotate enhancers is grossly overstated, particularly when considering the issues with contact frequency stated above. Therefore, claims like 80% of enhancers are wrongly annotated should be removed from the paper. The authors should discuss in the Discussion how to annotate enhancers and what the proper method for annotation is.

      We have amended the text to indicate that we do not suggest that VISTA enhancers are wrongly annotated but incompletely assigned. We apologise for making this suggestion in the first draft. There is, however, complementary evidence from Cheng et al (2024), now referenced in the revised manuscript, who also find that 60% of the VISTA enhancers skip their adjacent gene. It is also well established in the literature that the nearest gene is not always the regulated target.

      Other points:

      - The authors claim that PIFs have 2.14 and 2.69 fold enrichment of H3K4me1 and H3K27ac sites. Did the authors use the whole genome as background? If so, they should take into account that promoters are more likely in regions of high gene density, which are more dense in active marks. It would be better to perform local, circular permutation of the PIFs around the promoter.

      The reviewer is correct that a whole genome background is not an appropriate background for testing enrichment of active marks within PIFs. Fortunately, this is taken into account in the CHiCAGO enrichment test which selects the background from fragments that are matched to the same distance of the PIFs to account for the observation that promoters are more likely in regions of high gene density and are therefore more enriched for active chromatin modifications.
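
      As a rough illustration of the local circular permutation the reviewer proposes (this is not the CHiCAGO enrichment test used in the manuscript), the sketch below shifts all PIFs by the same random offset within a local window and compares the number of marked PIFs against that null. The function name, the boolean `mark_mask` representation of a chromatin mark, and the window layout are assumptions made purely for illustration.

```python
import numpy as np

def circular_permutation_enrichment(pif_starts, pif_len, mark_mask, n_perm=1000, seed=0):
    """Fold enrichment of marked PIFs versus circularly shifted PIFs.

    pif_starts: start coordinates of PIFs inside a local window (hypothetical input)
    pif_len:    length of each PIF in the same coordinate units
    mark_mask:  boolean array over the window, True where a mark (e.g. H3K27ac) is present
    """
    rng = np.random.default_rng(seed)
    window = mark_mask.size
    starts = np.asarray(pif_starts)

    def count_marked(s):
        # A PIF counts as marked if any position in it carries the mark;
        # coordinates wrap around the window (the "circular" part).
        idx = (s[:, None] + np.arange(pif_len)[None, :]) % window
        return int(mark_mask[idx].any(axis=1).sum())

    observed = count_marked(starts)
    # Shift every PIF by the same non-zero offset so their relative spacing is preserved.
    shifts = rng.integers(1, window, size=n_perm)
    null = np.array([count_marked((starts + k) % window) for k in shifts])

    fold = observed / max(null.mean(), 1e-12)
    # Empirical one-sided p-value with the usual +1 correction.
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, fold, p_value
```

      Because the shifted PIFs stay within the same window, the null inherits the local gene density and mark density, which is the property the reviewer asks for.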

      - The authors talk about "lead PIF", which is the fragment with the "most significant CHICAGO score". What does this mean? Something is significant or not, despite common misuse of the term there is no gradient of significance.

      The reviewer makes a good point here and we apologise for the oversight in wording. We have corrected the text to specify that the lead PIF is the one with the highest CHiCAGO score.

      - In the GO analysis the categories with the lowest p-value are presented, but this biases for large categories. It would be more relevant to also select for and show the enrichment scores.

      We agree with the reviewer that a drawback of GO analysis is its bias towards large categories. If by ‘enrichment score’ the reviewer means the –log10(p-value), this is included in the supplementary tables, which also list the size of each category and the number of genes detected in it.

      Reviewer #3 (Significance):

      The study provides a dataset that may be interesting for people studying cerebellar development. In that sense the data is mostly interesting from a fundamental viewpoint. The data seem of good quality.

      The authors claim that a very sizeable fraction of enhancers are misannotated, but I do not believe that this is correct.

      We thank the reviewer for pointing this out. We apologise for creating the impression that VISTA enhancers are incorrectly annotated. We have amended the text to reflect that these are incompletely annotated.

      My expertise is 3D genome, bioinformatics.

    1. eLife Assessment

      This important study concerns the propagation of waves in bacterial biofilms, bridging active matter physics and bacterial biophysics. The experimental observations are solid, and the theoretical interpretation and model validation have been refined with revisions. This work will be of interest to microbiologists, biophysicists, and researchers studying collective behavior in biological systems.

    2. Reviewer #1 (Public review):

      Summary:

      Overall, this is an interesting paper. The authors identify several experimental knobs that can perturb mechanical wave behavior driven by pili feedback. They frame these effects in terms of nonreciprocal interactions. While nonreciprocity could indeed play a role, it raises the question of whether mechanical feedback might also contribute. Phenomenological models can be useful, but the model currently lacks direct mechanistic insight. It would be more compelling to formulate the model around potential mechanochemical feedback, which could help clarify the underlying microscopic mechanisms.

      Strengths:

      Report of mechanical waves in bacterial collectives, mechanism has potential application in multicellular context such as morphogenesis.

      Weaknesses:

      A minor concern about the language of 'left-right asymmetry.' I believe the correct term is simply 'radial asymmetry' which is a distinct concept. Left-right is not well defined in the current context.

    3. Reviewer #3 (Public review):

      Summary:

      The revised manuscript presents a compelling study of radially propagating metachronal waves on the surface of Pseudomonas nitroreducens biofilms, combining experiments with two theoretical descriptions (a local phase-oscillator model and an active solid/active gel model). The central experimental findings (spiral/target/planar wave patterns, their controllability via water/PEG/temperature perturbations, and the correlation between frequency gradients and propagation direction) remain highly interesting and relevant to both bacterial biophysics and active-matter physics. The revised manuscript also adds substantial new material, including additional analyses of defect dynamics and clearer discussion of the relationship between the two models. The study continues to have a strong interdisciplinary appeal and the potential to stimulate further work on collective oscillations in biological active media.

      Strengths:

      The authors have substantially addressed the major conceptual issue raised in the previous round by clearly distinguishing between nonreciprocity and frequency gradients / global asymmetry. This clarification significantly improves the theoretical interpretation and resolves an important source of confusion in the original version.

      The revised manuscript also improves the connection between the phase-oscillator and active-solid descriptions. In particular, the authors now explain more explicitly how the phase variable is defined in the reduced oscillatory dynamics of confined biofilm motion, and they state that they added a schematic illustration and simulation details (including parameter values and the elastic-force definition) to improve reproducibility. This directly addresses one of my previous major concerns.

      A notable improvement is the newly added defect-based analysis of waveform transitions (spiral -> target -> planar). The revised text argues that defect motility is a key control parameter, linked experimentally to moisture-dependent elasticity and theoretically to nonreciprocity / defect-pair stability. This provides a more concrete mechanistic bridge between experimental perturbations and the modeling framework than in the previous version.

      The manuscript now gives a clearer experimental-theoretical narrative for how environmental manipulations (drying, water addition, PEG, heating) affect wave patterns through changes in effective elasticity and activity, including a useful distinction between short-timescale and long-timescale temperature effects. This added discussion strengthens the biological interpretation and makes the modeling assumptions easier to follow.

      Weaknesses:

      The main remaining limitation is the level of quantitative correspondence between theory and experiment. The revised manuscript now provides a stronger qualitative/mechanistic link, but the mapping between model parameters (e.g., effective coupling terms / elasto-active parameters) and directly measurable biofilm properties is still limited. The authors acknowledge this point, and I agree that it is technically challenging in the present system. However, this means the theoretical framework is currently most convincing as an effective mechanistic model rather than a quantitatively predictive one.

      Relatedly, some conclusions about parameter-level control (especially in connecting moisture/temperature manipulations to specific model parameters) remain qualitative. I do not view this as fatal, but I recommend that the manuscript clearly state this scope and avoid overstating the quantitative predictive power of the theory.

      Although the terminology has improved compared with the original version, the revised manuscript still uses "left-right asymmetry" in places where the underlying geometry and symmetry are more general (e.g., radial inward propagation in circular colonies). Since this wording was one of the original points of confusion, I suggest one final pass to ensure the symmetry language is consistently precise throughout the main text and figure captions.

    4. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This important study concerns the propagation of waves in bacterial biofilms, bridging active matter physics and bacterial biophysics. While the experimental observations are solid, the theoretical interpretation and model validation are currently incomplete and require further refinement. This work will be of interest to microbiologists, biophysicists, and researchers studying collective behavior in biological systems.

      In the revised manuscript, we have added new experimental results that strengthen the connection between our observations and the modeling framework used to interpret the collective oscillations. We have not introduced a new theoretical model; rather, we employed established active matter models and sought to link the observed phenomena to these frameworks. In particular, our new data demonstrate that the transition between the motile and biofilm-forming states specifically modulates the elasticity and elasto active coupling of the bacterial structure. This behavior is in excellent agreement with the predictions of the active solid model. All the experimental details are given below. We believe that the revised version of the manuscript now establishes this connection more clearly and convincingly.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Overall, this is an interesting paper. The authors have found multiple experimental knobs to perturb a mechanical wave behavior driven by pili feedback. The authors framed this as nonreciprocal interactions - while I can see how nonreciprocity could play a role - what about mechanical feedback? Phenomenological models are fine, but a lack of mechanistic understanding is a weakness. I think it will be more interesting to frame the model based on potential mechanochemical feedback to understand microscopic mechanisms. Regardless, more can be done to better constrain the model through finding knobs to explain experimental observations (in Figures 3, 4, 5, and 7).

      We thank the reviewer for the positive assessment and for highlighting this important point. The reviewer is correct that the phenomenological Kuramoto-based model does not explicitly show the detailed cell–cell interactions. However, the active solid model is formulated on detailed elastic couplings and active forces, which inherently represent mechanical feedback within the biofilm structure. In this framework, nonreciprocity emerges naturally from the tensorial nature of active forces between bacteria—a concept already well established in the active matter literature. Importantly, this mechanism is purely mechanical and closely parallels nonreciprocal hydrodynamic interactions among active particles, which also arise from tensorial couplings.

      In our system, elastic interactions within the biofilm matrix, combined with pilus-generated active forces, provide a natural origin for nonreciprocal interactions. To further validate this, we improved our imaging to record single-cell dynamics both at the colony edge and on the biofilm surface (new supplementary Video). These experiments show that motile bacteria at the leading edge of the biofilm structure do not generate waves, whereas stationary bacteria within the biofilm display local oscillations within the elastic network. This observation supports the view that collective oscillations are a property of the elastic biofilm state rather than of freely motile cells.

      Moreover, the main control parameter for these oscillations is the ratio between elastic strength and the active force generated by pili. In the active solid model, this ratio is captured by the parameter π and alpha terms. Experimentally, we can tune this ratio simply by adding or removing water from the biofilm, thereby modulating its elasto-active coupling. We further demonstrated the controllability of this feature experimentally: we let the plate dry nonuniformly and observed that transitions between spiral, target, and plane waves could emerge spontaneously across the plate (see Figure 3a). This observation also underscores the importance of moisture in the biofilm. Starting from this point, we established the connection between experimental observation and modelling. In our new simulations we also noticed that the transition from spiral to target waves is driven in particular by the merging of spiral pairs carrying topological charges of +/-1. This critical point was also confirmed by modelling, which links the process to elasto-active coupling. Furthermore, we supported our claim by imaging the edge and the biofilm structure. These new results clarify that the elastic structure of the biofilm is critically important (Supplementary Figure 3). We have clarified this mechanistic link in the revised manuscript and rewritten the relevant sections to make this connection explicit.

      Modification in the manuscript:

      “To gain deeper insight into the mechanisms underlying wave formation, we imaged the dynamics of individual bacteria from the fingering regions toward the center of the biofilm. This distinction is critical because, unlike the biofilm center, the edges do not generate waves. We observed that bacteria near the fingering regions remain motile and exhibit collective flow. In contrast, bacteria at the biofilm center are surface-attached and undergo periodic lifting motions. This behavior strongly resembles Mexican-wave dynamics.”

      “We further found that the central region of the biofilm is mechanically more elastic, whereas the edge regions—where wave formation is absent—are motile. These observations suggest that gradual biofilm maturation is a key factor that transforms motile bacteria into a periodically moving but spatially constrained state. Consistent with this picture, the PAO1 strain, which has a strong biofilm-forming capability, completely suppresses surface oscillations. In contrast, the PA14 strain exhibits intermediate behavior, sustaining a partial transition between motile and locally constrained dynamics. Remarkably, signatures of this transition and wave generation are already detectable at the earliest stages of finger formation.”

      Strengths:

      The report of mechanical waves in bacterial collectives. The mechanism has potential application in a multicellular context, such as morphogenesis.

      We thank the reviewer for the positive assessment and for highlighting this potential broad impact of our findings.

      Weaknesses:

      My most serious concern is about left-right symmetry breaking. I fail to see how the data in Figure 6 shows LR symmetry breaking. All they show is in-out directionality, which is a boundary condition. LR SM means breaking of mirror symmetry - the pattern cannot be superimposed on its mirror image using only rigid body transformations (translation and rotation) - as far as I am aware, this condition is not satisfied in this pattern-forming system.

      We thank the reviewer for pointing out this critical issue. We acknowledge that we overlooked the distinction between biological and physical definitions of left–right symmetry in our initial submission, and we agree that our terminology was confusing.

      In developmental biology, the term “left–right symmetry breaking” is often used to describe asymmetric flows generated by nodal cilia, which subsequently establish developmental asymmetry. This usage differs fundamentally from the physical definition of mirror symmetry breaking, which refers to chirality switching upon mirror reflection. As the reviewer correctly noted, our system does not exhibit mirror symmetry breaking in this strict physical sense.

      To avoid confusion, we have revised the manuscript and replaced the term left–right symmetry breaking with left–right asymmetry between the edge and the center of the biofilm. This asymmetry arises from frequency gradients across the biofilm and is not a trivial boundary effect. For circular colonies, this phenomenon is more accurately described as radial asymmetry. We have rewritten the relevant sections of the manuscript to clarify this distinction and prevent misinterpretation.

      Reviewer #2 (Public review):

      Summary:

      This manuscript by Altin et al. examines the dynamics of bacterial assemblies, building on previously published work documenting mechanical spiral waves. The authors show that the emergent dynamics can be influenced by various factors, including the strain of bacteria and water content in the sample. While the topic of this paper would be of broad interest, and the preliminary results are certainly interesting, various aspects of this paper are underdeveloped and require further exploration.

      Strengths:

      One of the nice features of this system is the ability to transition between the different states based on the addition or withdrawal of water. The authors use a similar experimental model system and mathematical model to previously published work (Reference 49), but extend by showing that the behaviour can be modified through simple interventions. Specifically, the authors show that adding water droplets or drying the sample through heating can result in changes in the observed wave structure. This represents a possible way of controlling active matter.

      The mathematical model proposed in this paper involves a phase-oscillator model of Kuramoto-style coupling (similar to previously reported models). A non-reciprocal phase lag is introduced in order to facilitate the patterns seen in experiments. The qualitative agreement in the behaviour is quite striking, showing both spiral waves and travelling waves.

      We thank the reviewer for the positive assessment and for pointing out areas that required further development. The reviewer is correct that our work builds on previously reported bacterial spiral wave systems; however, there are several significant differences that we now emphasize more clearly in the revised manuscript.

      First, our study involves a different bacterial species and reveals a distinct dynamical process: the waves we report are strictly localized on the surface of the biofilm, in contrast to the bulk oscillations detected through density fluctuations in the earlier work (Ref. 49). The surface waves in our system resemble “Mexican wave”-like motions, in which surface bacteria periodically lift upward. To highlight this key distinction, we performed new imaging experiments that directly visualize this process (New Video 5 and 6, Author response image 1).

      Second, we systematically compared different bacterial strains, including pathogenic species such as P. aeruginosa PA14 and PAO1, alongside our BSL-1 strain. This comparative approach demonstrates that the observed phenomenon spans strains with different pathogenicity levels and genetic variations, while also showing that our strain provides a safer and more broadly usable model system for laboratory investigations.

      Third, the modeling frameworks differ. Whereas the referred study relied primarily on phase models similar to those used in cilia systems, we combine a delayed Kuramoto-style oscillator model with an active solid model. This combination provides both a phenomenological description and a physical interpretation of the collective dynamics. We acknowledge that, in the original submission, the physical interpretation of the model in relation to our experimental system was underdeveloped. In the revision, we have now established this link explicitly through the elasticity and elasto-active coupling of the biofilm. Specifically, we show that the transition from motile to biofilm states is accompanied by changes in elasticity, which directly influence the observed transitions between different types of wave defects. This connection is consistent with prior theoretical works and has so far been studied only in robotic active matter systems.

      Together, these clarifications and new results reinforce the novelty of our findings and establish a stronger connection between the experiments and the modeling framework.

      Author response image 1.

      Comparison between the elastic biofilm core and the motile colony edge. High-resolution video recordings revealing individual bacterial motion highlight the key physical differences driving wave generation. Time-lapse snapshots show that bacteria at the colony edge move freely and form fingering structures, whereas bacteria in the elastic central biofilm periodically lift vertically, producing a Mexican-wave–like collective motion across the surface. See new Video.

      Weaknesses:

      The principal observation of the paper - that spiral waves emerge in these systems and can be controlled in various ways - is not linked to microscale dynamics at the cell level. It is recognised that hydrodynamics can introduce non-reciprocity, an essential ingredient of this model. However, in this work the authors have not identified a physical mechanism for the lag, e.g., either through steric interactions or hydrodynamic disturbances. This is also relevant in the phase oscillator modelling section. In low Reynolds number flows, dynamics are instantaneously determined. In this light, what does the phase lag term represent?

      The reviewer is correct that, at low Reynolds numbers, fluid dynamics are instantaneous and do not generate real temporal delays. However, nonreciprocity in hydrodynamic interactions can still emerge from the tensorial structure of the Blake–Oseen Green’s function. In this formalism, the effective asymmetry can be represented mathematically as a phase-lag–like term. This has been theoretically demonstrated in Ref. 40. While this is not a literal time delay, it functions analogously by breaking odd symmetry in the coupling.

      In our system, strong long-range hydrodynamic interactions are absent, as the bacteria are embedded in an elastic biofilm matrix. Instead, the dominant interactions are active elastic couplings mediated by pili and biofilm structure. The elastic solid model behaves in a way that is conceptually similar to the hydrodynamic case: pili-induced deformations of the elastic medium produce anisotropic stresses that play a role analogous to the tensorial hydrodynamic Green’s function. Thus, the phase-lag term in our Kuramoto-based model can be interpreted as an effective representation of these nonreciprocal elastic interactions.

      We have clarified this point in the revised manuscript by explicitly connecting the phenomenological phase-lag term to the underlying elastic coupling in biofilms.
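
      To make the role of a phase-lag term concrete, the following minimal sketch integrates a nearest-neighbour Kuramoto model with a uniform phase lag on a 1D ring. It is a generic Sakaguchi-Kuramoto form chosen for illustration, not the model equations of the manuscript; the function name, the ring geometry, and the parameters `coupling`, `alpha`, and `dt` are assumptions.

```python
import numpy as np

def kuramoto_phase_lag_step(theta, omega, coupling, alpha, dt):
    """One explicit Euler step of nearest-neighbour Kuramoto dynamics on a ring.

    theta:    array of oscillator phases
    omega:    natural frequency (scalar or array matching theta)
    coupling: coupling strength (hypothetical parameter)
    alpha:    phase lag; alpha = 0 recovers the standard odd sine coupling
    """
    left = np.roll(theta, 1)    # phase of the left neighbour (periodic boundary)
    right = np.roll(theta, -1)  # phase of the right neighbour
    # With alpha != 0 the coupling function sin(x - alpha) is no longer odd in x.
    dtheta = omega + coupling * (np.sin(left - theta - alpha)
                                 + np.sin(right - theta - alpha))
    return theta + dt * dtheta
```

      In this sketch, a non-zero `alpha` shifts the frequency even of perfectly synchronized oscillators by `-2 * coupling * sin(alpha)`, which shows that the lag acts on the effective frequency without being a literal time delay.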

      What is the origin of the coupling term, b? Can this be varied systematically or derived from experimental measurements or parameters?

      The term b represents the enhanced elasto-active coupling of the pili process. The length of the pili varies, and elongated pili have more potential to modulate the coupling between bacteria, which is known to depend on a critical threshold. This process resembles pinning dynamics and is driven by the activity of molecular motors within the pili machinery. However, the detailed mechanisms that set the effective coupling strength remain highly complex and are not yet fully understood.

      At present, we do not have a direct way to systematically manipulate b in experiments. A major technical limitation is the nanoscale nature of type IV pili: these protein assemblies are extremely small and difficult to monitor or manipulate directly. Even basic tools such as GFP-based labeling have proven challenging to implement, which restricts our ability to track the detailed dynamics of these structures in live biofilms.

      While we cannot currently derive b directly from experimental parameters, we emphasize in the revised manuscript that b should be understood as an effective parameter capturing the excitability of pili retractions. We also highlight this limitation and note that future advances in molecular imaging and manipulation of pili will be essential for quantitatively linking b to microscopic processes.

      Classification of wave properties is an important aspect of this paper, but is not accomplished in a quantitative sense. What is the method for distinguishing between travelling and spiral waves? There is a range of quantitative tools that could be used to investigate these dynamics (and also compare quantitatively with the models). For example, examining the correlation functions and order parameters could assist with the extraction of wave features (see extensive literature on oscillator models).

      We thank the reviewer for emphasizing this important point. In the revised manuscript, we have incorporated the classic Kuramoto order parameter (S) to characterize the dynamics in our model simulations. However, this metric is not directly applicable to our experimental system, because we cannot resolve the phase of individual bacteria at large scales.

      Instead, we have focused on a flux-based parameter, as previously used in Ref. 40, which can be measured experimentally from collective surface dynamics. Interestingly, we find that the directional flux extracted from our experimental movies closely matches the trends predicted by the model order parameter. We suspect that this similarity arises from the combination of our optical illumination method and the characteristic surface modulations of the biofilm. Because we currently lack a rigorous theoretical justification for this correspondence, we prefer to keep this discussion in the review document.

      In summary, we now use the classic Kuramoto order parameter in simulations and rely on the experimentally accessible flux measure for our experimental data. This dual approach allows us to compare model and experiment in a consistent manner.
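
      For reference, the classic Kuramoto order parameter mentioned above (denoted S in this response) is the magnitude of the mean phase vector. A minimal sketch, assuming the simulated phases are available as an array in radians; the function name is illustrative:

```python
import numpy as np

def kuramoto_order_parameter(theta):
    """Return (R, psi): the magnitude and the mean phase of (1/N) * sum_j exp(i * theta_j)."""
    z = np.mean(np.exp(1j * np.asarray(theta)))
    return np.abs(z), np.angle(z)
```

      The magnitude is 1 for fully synchronized phases and close to 0 when phases are spread evenly around the circle, for example over one full turn of a spiral or one wavelength of a plane wave, which is one reason a flux-type measure can be more informative for propagating waves.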

      Author response image 2.

      Critical order parameters of the coupled biofilm system. (a) The Kuramoto global order parameter increases continuously as the system becomes globally synchronized. In contrast, in the nonreciprocally coupled system the order parameter saturates at a critical level. (b) In the experimentally observed biofilm, however, the flux generated by the coupled oscillations provides a more appropriate measure of synchronization. Blue curves indicate directionally propagating planar waves, red curves correspond to spiral wave formation, and green curves represent the globally synchronized reciprocal system.

      Author response image 3.

      Comparison of flux profiles of the simulations with experimental measurements. Directional optical illumination enhances the flux term on the surface of the biofilm.

      The methodology of changing the dynamics through moisture content appears to be slightly underdeveloped, e.g., adding water involves a droplet, and removing water is accomplished by heating (which presumably could cause other effects). Could the dynamics not be controlled more directly by varying the humidity?

      We thank the reviewer for this valuable suggestion. Our results indicate that water content in the biofilm plays a key role in driving the transition to the biofilm state by modulating its elasticity. During the initial submission, we did not know how to systematically vary humidity without simultaneously altering temperature. Standard approaches typically involve water evaporation in controlled chambers, which inherently changes both parameters.

      Following the reviewer’s recommendation, we first measured the ambient moisture levels inside closed culture plates. To our surprise, the relative humidity was already ~98%, leaving virtually no room to increase it further. We then attempted to decrease humidity by flowing dry synthetic air, but even under these conditions we could not reduce it below ~85%, and achieving this required unrealistically high flow rates. Moreover, we noticed that in closed-lid NGM plates, evaporation is already substantial, and when the lid is left open the evaporation rate reaches ~1 µm/s. This rapid surface thinning severely limits the quality of long-term time-lapse imaging.

      Taken together, these technical constraints explain why we have to rely on localized perturbations such as water droplets and heating rather than global humidity control. We have clarified this point in the revised manuscript and now explicitly discuss both the challenges and limitations of humidity-based approaches.

      At the same time, the authors also mention that temperature itself plays a role in shaping the behaviour. What is the mechanism for this? Is it just through evaporation? Since the frequency increases with temperature, could it just be that activity increases with temperature?

      We thank the reviewer for raising this critical point. We believe that temperature has two distinct impacts operating on different timescales.

      Short timescale (~minutes): We observed that biofilm oscillations respond to temperature changes very rapidly and in a reversible manner. This timescale is too short to be explained by modulation of water content or bulk elasticity of the biofilm. Instead, we attribute the immediate frequency increase to enhanced biological activity of the bacteria at elevated temperatures.

      Long timescale (~tens of minutes to hours): During processes such as the transition from planar to spiral waves, prolonged heating can significantly alter the biofilm structure. These changes are not reversible and likely involve modifications of elasticity and other structural properties.

      In the modeling framework, the short-timescale effect is represented as an increase in the active force term, while the long-timescale effect is captured by concurrent changes in both the active force and the elastic properties of the biofilm. We have clarified this mechanism and its representation in the revised manuscript.

      Reviewer #3 (Public review):

      Summary:

      This manuscript presents a novel investigation into unidirectionally propagating waves observed on the surface of Pseudomonas nitroreducens bacterial biofilms. The authors explore how these waves, initially spiral in form, transition into combinations of spiral, target, and planar patterns. The study identifies the periodic extension-retraction cycles of type IV pili as the driving mechanism for wave propagation, which preferentially moves from the colony's edge to its center. Furthermore, the manuscript proposes two theoretical models (a phase-oscillator model and a continuum active solid model) to reproduce these phenomena, and demonstrates how external manipulations (e.g., water droplets, temperature, PEG) can control wave patterns and direction, often correlating with oscillation frequency gradients. The work aims to bridge the fields of active-matter physics and bacterial biophysics by providing both experimental observations and theoretical frameworks for understanding these complex biological wave phenomena.

      We thank the reviewer for the positive assessment of our work and for highlighting both the novelty and the key contributions of our study.

      Strengths:

      The experimental discovery of unidirectionally propagating waves on bacterial biofilms is highly intriguing and represents a significant contribution to both microbiology and active-matter physics.

      The detailed observations of wave pattern transitions (spiral to target to planar) and their response to various environmental perturbations (water, temperature, PEG) provide valuable empirical data. The identification of type IV pili as the driving force offers a concrete biological mechanism. The observed correlation between frequency gradients and wave direction is a compelling finding with potential for broader implications in understanding biological pattern formation. This work has the potential to stimulate further research in the collective behavior of living systems and the physical principles underlying biological organization.

      We thank the reviewer once again for emphasizing the importance of wave directionality. We also believe that this phenomenon may provide insight into early symmetry-breaking processes observed in developmental biology, where oxygen or nutrient gradients in dense environments could play a similar role.

      Weaknesses:

      The manuscript attempts to link unidirectional wave propagation to non-reciprocal couplings but ultimately shows that the wave direction is determined by the gradient of the oscillation frequency. The couplings in the two theoretical models are both isotropic and thus cannot dictate the wave direction. A clear distinction should be made between non-reciprocity as a source of wave generation and non-uniformity as a controlling factor of wave direction.

      We greatly appreciate the reviewer’s careful evaluation, particularly for highlighting this important and often confusing distinction. The relationship between nonreciprocity, spontaneous symmetry breaking, and frequency gradients has also been a challenging concept for us and required significant effort to clarify.

      Recent theoretical studies have established that traveling wave formation requires nonreciprocity, which provides a framework for understanding phenomena ranging from spiral to target and planar waves. In our system, nonreciprocity arises between the displacement field (U) and the pili force vector (P): as a result, in the broken phase, U effectively “chases” P, breaking PT symmetry locally and thereby enabling the generation of local directional flux and traveling waves. In this sense, nonreciprocity is essential for traveling wave generation and spontaneous symmetry breaking in either direction.

      However, we now agree that global directionality (always from right to left, or edge to center) is set by an independent factor, namely the oscillation frequency gradient across the biofilm. Thus, while nonreciprocity determines whether waves can travel, frequency gradients determine the large-scale direction in which they propagate. Put differently, PT symmetry is already broken in spiral waves due to nonreciprocity, but global asymmetry (frequency gradients) is required to align the overall propagation in one direction.

      We have clarified this distinction in the revised manuscript, emphasizing that nonreciprocity is a necessary ingredient for travelling wave generation, whereas global asymmetry controls global wave direction.
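
      The role of the frequency gradient alone can be illustrated with a toy calculation. The sketch below is a deliberately simplified assumption, not the model used in the manuscript: it takes a one-dimensional chain of phase oscillators with symmetric (reciprocal) nearest-neighbor sine coupling and a linear gradient in natural frequency. All names and parameter values are hypothetical. Once the chain phase-locks, the high-frequency end leads in phase, so lines of constant phase travel toward the low-frequency end.

```python
import math


def simulate_chain(omegas, coupling=1.0, dt=0.01, steps=20000):
    """Euler-integrate d(theta_i)/dt = omega_i + K * sum_nbrs sin(theta_j - theta_i).

    Toy 1D chain with open ends and reciprocal coupling (an assumption made
    to isolate the effect of the frequency gradient). Phases are not wrapped.
    """
    n = len(omegas)
    theta = [0.0] * n
    for _ in range(steps):
        rates = []
        for i in range(n):
            rate = omegas[i]
            if i > 0:
                rate += coupling * math.sin(theta[i - 1] - theta[i])
            if i < n - 1:
                rate += coupling * math.sin(theta[i + 1] - theta[i])
            rates.append(rate)
        theta = [t + dt * r for t, r in zip(theta, rates)]
    return theta


n = 10
# Natural frequency decreases linearly from site 0 to site n - 1.
omegas = [1.0 - 0.1 * i / (n - 1) for i in range(n)]
theta = simulate_chain(omegas)

# Phase difference between each site and its lower-frequency neighbor.
phase_steps = [theta[i + 1] - theta[i] for i in range(n - 1)]
```

      In the locked state every entry of `phase_steps` is negative: the phase falls monotonically from the high-frequency site to the low-frequency site, which corresponds to a wave moving down the frequency gradient. This only shows how a gradient selects a direction; it says nothing about wave generation, which the response attributes to nonreciprocity.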

      Modification in the manuscript:

      “We should note that traveling waves indicate broken PT symmetry between these fields triggered by nonreciprocity, with spiral waves serving as a classic signature of this phenomenon. A further transition from spiral to planar waves reflects an overall asymmetry in the frequency profile, which is not directly related to PT-symmetry breaking.”

      The relationship between the phase oscillator model and the active solid model is unclear. Given that U and P are both dynamical variables evolving in three-dimensional space, defining the phase Φ precisely in the phase space spanned by U and P could be challenging. A graphical illustration of the definition of Φ would be beneficial. To ensure reproducibility of the numerical results, the parameter values used in the numerical simulations and an explicit definition of the elastic force in the active solid model should be provided.

      We agree with the reviewer that the relationship between the phase oscillator model and the active solid model can be confusing, but establishing this link is essential to connect different modeling approaches in the literature. As the reviewer notes, in a fully three-dimensional setting with freely moving bacteria, defining the oscillation phase (Φ) in the phase space spanned by U and P is indeed complicated.

      However, our recent imaging results show that bacteria within the biofilm do not undergo large translational motions but instead exhibit periodic “Mexican wave”-like oscillations. These oscillations are confined to a restricted phase space, which allows us to define Φ in a straightforward way. In this context, the phase oscillator model becomes a natural reduction of the dynamics.

      Similarly, in the active solid (or active gel) model, we can plot not only the displacement and force vectors but also the local phase, which shows strong agreement with the phenomenological Kuramoto-style model. To make this connection clearer, we have now included a schematic illustration in the revised manuscript that explicitly shows how Φ is defined in the reduced phase space, and we provide the parameter values used in the simulations as well as the explicit definition of the elastic force in the active solid model to ensure reproducibility.
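
      As a rough illustration of what defining Φ in a reduced phase space can mean, one common construction is to take the polar angle of the point (U, P) once the oscillation is confined to a closed loop around the origin. The sketch below assumes scalar, mean-subtracted components u and p and a circular limit cycle; these choices and the function name are assumptions for illustration, since the definition used in the revised manuscript is not reproduced in this response.

```python
import math


def phase_from_pair(u, p):
    """Polar angle of the point (u, p), in radians within (-pi, pi].

    Hypothetical definition: assumes u and p are mean-subtracted scalar
    components of displacement and pili force tracing a loop around 0.
    """
    return math.atan2(p, u)


# Synthetic limit cycle u = cos(t), p = sin(t); the recovered phase equals t.
times = [-3.0 + 0.5 * k for k in range(13)]  # samples inside (-pi, pi)
recovered = [phase_from_pair(math.cos(t), math.sin(t)) for t in times]
```

      For a distorted or off-center loop the same angle would still be monotonic along the cycle but would not advance uniformly in time, so a real analysis would need centering and possibly rescaling of the two variables first.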

      The link between the theoretical models and experimental results is weak. For example, the propagation of the kink from the lower to the higher part of the surface (Figure 1e) could be addressed within the framework of the active solid model. The mechanism of transition from spiral to target waves (Figure 3a), b)) requires clarification, identifying which model parameter is crucial for inducing this transition. The wave propagation toward the lower frequency side is numerically demonstrated using the phase oscillator model, but a physical or intuitive explanation for this phenomenon is missing. Also, the wave transitions induced by the addition of water droplets and temperature rise are not linked to specific parameters in the theoretical models.

      We thank the reviewer for highlighting this important weakness, which was also consistently noted by the other reviewers. We fully agree that the link between our theoretical models and experimental results required significant strengthening.

      With improved imaging in the revised study, we were able to uncover additional connections that help establish this link more clearly. We acknowledge that our ability to measure detailed biofilm parameters is limited, which restricts us from providing fully quantitative mappings. Nonetheless, based on the reviewers’ suggestions, we carried out additional imaging and simulations to compare bacterial dynamics at the colony edge and within the biofilm surface. These data confirm that cells within the biofilm undergo restricted, “Mexican wave”-like oscillations, emphasizing the critical role of elasticity in governing the collective dynamics.

      Experimentally, we found that adding water or PEG, or alternatively inducing drying, strongly modulates the effective elasticity of the biofilm. Within the active solid framework, elasticity and the elasto-active coupling are the key parameters controlling the system. By tuning these parameters in simulations, we could reproduce the qualitative transitions observed experimentally. Specifically, we observed that:

      At low elasticity, topological defects are mobile and can move, merge, or annihilate, leading to the emergence of planar waves.

      At high elasticity, defects remain pinned across the biofilm surface and dominate the dynamics.

      These observations suggest that the motility of defects is the crucial parameter governing the transition between spiral, target, and planar waves. Although we cannot independently manipulate each parameter in experiments, varying the moisture content provides an effective and experimentally accessible control.

      Finally, our simulations and new analyses reveal that spiral defect cores can move and merge to form target waves or annihilate entirely—processes that we also observe experimentally. This rich dynamical behavior underscores the importance of elasticity in shaping pattern transitions, and we believe it warrants further theoretical exploration. We have clarified this connection and its implications in the revised manuscript.

      First, we compare defect dynamics in both Kuramoto-based simulations and the active solid model. Both systems exhibit similar defect-survival behavior. As shown in the review, pairs of unlike (+/−) defects can stably persist only at high nonreciprocity. We further quantify this behavior by plotting the separation distances between unlike defect pairs and find that short-range defect separations are possible exclusively in the high-nonreciprocity regime (Supplementary Figure 11).

      This high-nonreciprocity regime corresponds to the dry biofilm state. Increasing moisture reduces elasticity, leading to the loss of stable defect dynamics and promoting the annihilation of unlike defect pairs, which in turn drives the system toward target-wave formation and ultimately planar waves. Conversely, heating the biofilm removes water, enhances elasticity, and increases the system’s ability to sustain closely separated defect pairs.

      Experimentally, we further observe that removing water by heating enhances surface nonuniformities, which readily trigger defect-pair formation. To investigate this mechanism, we performed additional simulations in which local nonuniformities were introduced (Supplementary Figure 12). Consistent with experiments, defect-pair generation occurs only at high nonreciprocity, where pairs of unlike defects can be stably maintained. Experimental observations (Author response image 4) also show that surface nonuniformities on the biofilm surface similarly trigger the formation of closely separated defect pairs. We have updated the details of the defect dynamics in the revised manuscript to clarify the transition between these waves.

      Author response image 4.

      Experimental observation showing that small surface nonuniformities on the biofilm surface trigger the formation of closely separated defect pairs. Arrows indicate the positions of the nonuniformities.

      Modification in the manuscript:

      Defect dynamics controlling the transition between spiral and target waves

      “To better understand the dynamics of the transition between different forms of the waves, we focused on numerical simulations. We noticed that the motility of defects is the crucial parameter governing the transition between spiral, target, and planar waves; varying the moisture content provides an effective and experimentally accessible way to control this motility. Our analyses revealed that spiral defect cores can move and merge to form target waves or annihilate entirely, processes that we also observe experimentally. This rich dynamical behavior underscores the importance of elasticity in shaping pattern transitions. First, we compare defect dynamics in both Kuramoto-based simulations and the active solid model. Both systems exhibit similar defect-survival behavior. As shown in Supplementary Figure 10, pairs of unlike (+/−) defects can stably persist only at high nonreciprocity. We further quantify this behavior by plotting the separation distances between unlike defect pairs and find that short-range defect separations are possible exclusively in the high-nonreciprocity regime (Supplementary Figure 11). This high-nonreciprocity regime corresponds to the dry biofilm state. Increasing moisture reduces elasticity, leading to the loss of stable defect dynamics and promoting the annihilation of unlike defect pairs, which in turn drives the system toward target-wave formation and ultimately planar waves. Conversely, heating the biofilm removes water, enhances elasticity, and increases the system’s ability to sustain closely separated defect pairs. Experimentally, we further observe that removing water by heating enhances surface nonuniformities, which readily trigger defect-pair formation (Supplementary Video 9). To investigate this mechanism, we performed additional simulations in which local nonuniformities were introduced (Supplementary Video 12-13). Consistent with experiments, defect-pair generation occurs only at high nonreciprocity, where pairs of unlike defects can be stably maintained. Experimental observations (Supplementary Video 9) also show that surface nonuniformities on the biofilm surface similarly trigger the formation of closely separated defect pairs.”

      All the recommended points have been addressed in the revised manuscript.

    1. eLife Assessment

      This important study combines a two-person joint hand-reaching paradigm with game-theoretical modeling to examine whether, and how, reflexive visuomotor responses are modulated by a partner's control policy and cost structure. The study provides a convincing set of behavioral findings suggesting that involuntary visuomotor feedback is indeed modulated in the context of interpersonal coordination. The work will be of interest to cognitive scientists studying the motor and social aspects of action control.

    2. Reviewer #1 (Public review):

      Summary:

      Sullivan and colleagues examined the modulation of reflexive visuomotor responses during collaboration between pairs of participants performing a joint reaching movement to a target. In their experiments, the players jointly controlled a cursor that they had to move towards narrow or wide targets. In each experimental block, each participant had a different type of target they had to move the joint cursor to. During the experiment, the authors used lateral perturbation of the cursor to test participants' fast feedback responses to the different target types. The authors suggest participants integrate the target type and related cost of their partner into their own movements, which suggests that visuomotor gains are affected by the partner's task.

      Strengths:

      The topic of the manuscript is very interesting, and the authors are using well-established methodology to test their hypothesis. They combine experimental studies with optimal control models to further support their work. Overall, the manuscript is very timely and shows important findings - that the feedback responses reflect both our and our partner's tasks.

    3. Reviewer #2 (Public review):

      Summary:

      Sullivan and colleagues studied the fast, involuntary, sensorimotor feedback control in interpersonal coordination. Using a cleverly designed joint-reaching experiment that separately manipulated the accuracy demands for a pair of participants, they demonstrated that the rapid visuomotor feedback response of a human participant to a sudden visual perturbation is modulated by his/her partner's control policy and cost. The behavioral results are well matched with the predictions of the optimal feedback control framework implemented with the dynamic game theory model. Overall, the study provides an important and novel set of results on the fast, involuntary feedback response in human motor control in the context of interpersonal coordination.

      Review:

      Sullivan and colleagues investigated whether fast, involuntary sensorimotor feedback control is modulated by the partner's state (e.g., cost and control policy) during interpersonal coordination. They asked a pair of participants to make a reaching movement to control a cursor and hit a target, where the cursor's position was a combination of each participant's hand position. To examine fast visuomotor feedback response, the authors applied a sudden shift in either the cursor (experiment 1) or the target (experiment 2) position in the middle of movement. To test the involvement of partner's information in the feedback response, they independently manipulated the accuracy demand for each participant by varying the lateral length of the target (i.e., a wider/narrower target has a lower/higher demand for correction when movement is perturbed). Because participants could also see their partner's target, they could theoretically take this information (e.g., whether their partner would correct, whether their correction would help their partner, etc.) into account when responding to the sudden visual shift. Computationally, the task structure can be handled using dynamic game theory, and the partner's feedback control policy and cost function are integrated into the optimal feedback control framework. As predicted by the model, the authors demonstrated that the rapid visuomotor feedback response to a sudden visual perturbation is modulated by the partner's control policy and cost. When their partner's target was narrow, they made rapid feedback corrections even when their own target was wide (no need for correction), suggesting integration of their partner's cost function. Similarly, they made corrections to a lesser degree when both targets were narrower than when the partner's target was wider, suggesting that the feedback correction takes the partner's correction (i.e., feedback control policy) into account.

      The strength of the current paper lies in the combination of clever behavioral experiments that independently manipulate each participant's accuracy demand and a sophisticated computational approach that integrates optimal feedback control and dynamic game theory. Both the experimental design and data analysis are sound, and the main claim is well supported by the results.

      A future direction would be to investigate how this mechanism is implemented in the CNS and to examine whether the same cooperative mechanism also applies to human-AI interactions.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary

      Sullivan and colleagues examined the modulation of reflexive visuomotor responses during collaboration between pairs of participants performing a joint reaching movement to a target. In their experiments, the players jointly controlled a cursor that they had to move towards narrow or wide targets. In each experimental block, each participant had a different type of target they had to move the joint cursor to. During the experiment, the authors used lateral perturbation of the cursor to test participants’ fast feedback responses to the different target types. The authors suggest participants integrate the target type and related cost of their partner into their own movements, which suggests that visuomotor gains are affected by the partner’s task.

      Strengths

      The topic of the manuscript is very interesting, and the authors are using well-established methodology to test their hypothesis. They combine experimental studies with optimal control models to further support their work. Overall, the manuscript is very timely and shows important findings - that the feedback responses reflect both our and our partner’s tasks.

      We thank the reviewer for the positive comments regarding our work.

      Weaknesses

      However, in the current version of the manuscript, I believe the results could also be interpreted differently, which suggests that the authors should provide further support for their hypothesis and conclusions.

      Major Comments

      (1) Results of the relevant conditions:

      In addition to the authors’ explanation regarding the results, it is also possible that the results represent a simple modulation of the reflexive response to a scaled version of cursor movement. That is, when the cursor is partially controlled by a partner, which also contributes to reducing movement error, it can also be interpreted by the sensorimotor system as a scaling of hand-to-cursor movement. In this case, the reflexes are modulated according to a scaling factor (how much do I need to move to bring the cursor to the target). I believe that a single-agent simulation of an OFC model with a scaling factor in the lateral direction can generate the same predictions as those presented by the authors in this study. In other words, maybe the controller has learned about the nature of the perturbation in each specific context, that in some conditions I need to control strongly, whereas in others I do not (without having any model of the partner). I suggest that the authors demonstrate how they can distinguish their interpretation of the results from other explanations.

      We thank the reviewer for the thoughtful comment. It is possible that the change in the visuomotor feedback responses arises simply from a scaling factor. This hypothesis could explain the difference between two conditions, but would fail to explain the difference between two other conditions. Specifically, this hypothesis could explain a decrease in involuntary visuomotor feedback responses between partner-irrelevant/self-relevant and partner-relevant/self-relevant. Critically, this hypothesis could not explain the difference between partner-irrelevant/self-irrelevant and partner-relevant/self-irrelevant. That is, there is no reason to scale a response to correct for a partner’s relevant target when your own target is irrelevant. However, our finding that there is a greater involuntary visuomotor feedback response in partner-relevant/self-irrelevant compared to partner-irrelevant/self-irrelevant is predicted by the notion that humans form a representation of others and consider their movement costs.

      We have added a paragraph in the discussion to justify our hypothesis over the scaling factor hypothesis.

      “Our hypothesis that the sensorimotor system uses a representation of a partner and considers the partner’s costs to modify involuntary visuomotor feedback responses can parsimoniously explain all of our experimental findings. There are a few alternative hypotheses that could explain a subset of results. One alternative hypothesis is that participants simply learned the hand to center cursor mapping in each experimental condition. That is, instead of using a model of their partner, participants simply adapted to the dynamics of the center cursor. However, this hypothesis would not predict an increased involuntary visuomotor feedback response in the partner-relevant/self-irrelevant condition compared to the partner-irrelevant/self-irrelevant condition. If participants did not form a model of their partner nor consider their partner’s costs, then they would not display an increased feedback response when they had an irrelevant target and their partner’s target was relevant. An increased feedback response to help a partner achieve their goal is captured by our hypothesis that the sensorimotor system uses a representation of a partner and considers the partner’s costs to modify involuntary visuomotor feedback responses.”

      (2) The effect of the partner target:

      The authors presented both self and partner targets together. While the effect of each target type, presented separately, is known, it is unclear how presenting both simultaneously affects individual response. That is, does a small target with a background of the wide target affect the reflexive response in the case of a single participant moving? The results of Experiment 2, comparing the case of partner- and self-relevant targets versus partner-irrelevant and self-relevant targets, may suggest that the system acted based on the relevant target, regardless of the presence and instructions regarding the self-target.

      We thank the reviewer for bringing up another valid point, which we discussed at length as a group when designing the experiment. The reviewer is correct in pointing out that the lack of difference in the involuntary epoch between the partner-relevant/self-relevant and partner-irrelevant/self-relevant conditions could potentially suggest that the sensorimotor system acted based only on relevant targets, irrespective of whether it was a self- or partner-relevant target. While the effect of the simultaneous presentation of a narrow and wide target on an individual’s response by themselves is unknown, comparing the differences between our other experimental conditions controls for this potential confound. Participants viewed a wide target and a narrow target on the screen in both the partner-irrelevant/self-relevant condition and the partner-relevant/self-irrelevant condition. Crucially, we found that the visuomotor feedback responses were greater in the partner-irrelevant/self-relevant condition compared to the partner-relevant/self-irrelevant condition in both Experiment 1 and 2. That is, participants were able to distinguish between the self-target and partner target and appropriately modify their feedback responses in both Experiment 1 and 2, despite there being both a wide and narrow target on the screen in both conditions. Given that we found different visuomotor feedback responses between the two conditions that had both a narrow and wide target, this rules out the alternative hypothesis that the sensorimotor system acted based just on a relevant target being present. We have added to our discussion to clarify this point.

      “Another alternative hypothesis would be that the sensorimotor system was responding only to the relevant target displayed on the screen. Again, this hypothesis would only explain a subset of our results. In particular, this relevant target hypothesis cannot explain the observed feedback response differences between the partner-relevant/self-irrelevant and partner-irrelevant/self-relevant conditions in both Experiments 1 and 2.”

      (3) Experiment instructions:

      It is unclear what the general instructions were for the participants and whether the instructions provided set the proposed weighted cost, which could be altered with different instructions.

      Our instructions explicitly informed participants that their performance bonus was only based on them stabilizing within their own self-target within the time constraint. We have added the following in the methods to emphasize this instruction.

      “In other words, we ensured participants had a clear understanding that their performance in the task was only based on stabilizing the center cursor in their own self-target within the time constraint. Therefore, the instructions and timing constraints did not enforce participants to work together.”

      (4) Some work has shown that the gain of visuomotor feedback responses reflects the time to target and that this is updated online after a perturbation (Cesonis & Franklin, 2020, eNeuro; Cesonis and Franklin, 2021, NBDT; also related to Crevecoeur et al., 2013, J Neurophysiol). These models would predict different feedback gains depending on the distance remaining to the target for the participant and the time to correct for the jump, which is directly affected by the small or large targets. Could time to target be used to explain the results instead? I don’t believe that this is the case, but the authors should try to rule out other interpretations. This is maybe a minor point, but perhaps more important is the location (and time remaining) for each participant at the time of the jump. It appears from the figures that this might be affected by the condition (given the change in movement lengths - see Figure 3 B & C). If this is the case, then could some of the feedback gain be related to these parameters and not the model of the partner, as suggested? Some evidence to rule this out would be a good addition to the paper - perhaps the distance of each partner at the time of the perturbation, for example. In addition, please analyze the synchrony of the two partners’ movements.

      (1) Time to target and forward position

      The reviewer raises an interesting point. In our task, the cursor/target jump occurs once the center cursor crosses 6.25 cm from the start. We analyzed the time it took for the center cursor to intercept the targets from perturbation onset (Supplementary D). In Experiment 1, an ANOVA with center cursor time-to-target as the dependent variable showed no main effect of self-target (F[1,47] = 2.45, p = 0.124) or partner target (F[1,47] = 2.50, p = 0.120), nor any interaction (F[1,47] = 1.97, p = 0.166). In Experiment 2, an ANOVA with center cursor time-to-target as the dependent variable showed a significant interaction (F[1,47] = 5.87, p = 0.019). Post-hoc mean comparisons showed that only the difference between the partner-irrelevant/self-irrelevant and partner-relevant/self-irrelevant conditions was significant (p = 0.006). Given that only one comparison in Experiment 2 showed a difference in time-to-target, we do not believe that time-to-target was a significant driver of the change in involuntary visuomotor feedback responses observed between conditions. While time-to-target is likely a metric the nervous system modifies feedback gains around, our results suggest that the nervous system can also use a partner model to modify feedback gains. We have added a supplemental analysis on time-to-target:

      “Previous work by Česonis and Franklin (2020) showed that time-to-target is a key variable the sensorimotor system uses to modify feedback responses. In their experiment, they manipulated the time-to-target of the participant’s cursor, while controlling for other movement parameters (e.g., distance from goal) [1]. When compared to classical optimal feedback control models, they showed that a model that modifies feedback responses based on time-to-target best predicted their results. In our task, it’s possible that the time-to-target could have influenced visuomotor feedback responses, since the distance to the center of the target is greater for a narrow target than a wide target on perturbation trials.”

      “We calculated the time from perturbation onset to the center cursor reaching the forward position of the targets (Supplementary Fig. S5). In Experiment 1, an ANOVA with center cursor time-to-target as the dependent variable showed no main effect of self-target (F[1,47] = 2.45, p = 0.124) or partner target (F[1,47] = 2.50, p = 0.120), nor any interaction (F[1,47] = 1.97, p = 0.166). In Experiment 2, an ANOVA with center cursor time-to-target as the dependent variable showed a significant interaction (F[1,47] = 5.87, p = 0.019). Post-hoc mean comparisons showed that only the difference between the partner-irrelevant/self-irrelevant and partner-relevant/self-irrelevant conditions was significant (p = 0.006). Although time-to-target and hand position are important variables for the control of movement [1,2,3], they are likely not driving factors of the different involuntary visuomotor feedback responses between our experimental conditions.”

      However, it is possible that the participant forward position at perturbation onset could also influence the involuntary feedback response. We show the forward positions at perturbation onset in Supplementary D. Statistical analysis of the forward positions in Experiment 1 showed a main effect of self-target (F[1,47] = 12.72, p < 0.001), main effect of partner target (F[1,47] = 12.82, p < 0.001), and no interaction (F[1,47] = 0.00, p = 0.991). We see the same trend in Experiment 2, showing a main effect of self-target (F[1,47] = 12.11, p < 0.001), main effect of partner target (F[1,47] = 12.04, p < 0.001), and no interaction (F[1,47] = 0.00, p = 0.986). The fact that there was no interaction implies that the results could not solely be due to forward position. Nevertheless, given there were main effects, we proceeded to run an ANCOVA on the involuntary visuomotor feedback responses with forward position as a covariate. For Experiment 1, we still observed a significant interaction between self and partner target (F[1,47] = 43.14, p < 0.001). Further, we also observed no significant main effect of forward position on the involuntary visuomotor feedback responses. The ANCOVA for Experiment 2 also showed that there was still a significant interaction of self and partner target on the involuntary visuomotor feedback responses (F[1,47] = 9.80, p = 0.002). However, here we did find a significant main effect of the forward position (F[1,47] = 5.06, p = 0.026). Therefore, we ran follow-up mean comparisons with the covariate adjusted means. We found the same statistical trend as reported in the main results. We found significant differences between the partner-irrelevant/self-irrelevant and partner-relevant/self-irrelevant conditions (p = 0.003), partner-relevant/self-irrelevant and partner-irrelevant/self-relevant conditions (p < 0.001), partner-relevant/self-irrelevant and partner-relevant/self-relevant conditions (p < 0.001). We found no significant difference between the partner-irrelevant/self-relevant and partner-relevant/self-relevant conditions (p = 0.381). Given that there was no main effect of forward position in Experiment 1, and that our adjusted mean comparisons in Experiment 2 showed the same trends as the unadjusted mean comparisons in the main manuscript, our results show that the forward position of the participants is not a significant factor in explaining the differences in involuntary visuomotor feedback responses between conditions.

      “Supplementary Fig. 6 shows the participant hand forward position at perturbation onset time for Experiment 1 (A) and Experiment 2 (B). It is possible that the participant forward hand position at perturbation onset time could influence their visuomotor feedback responses. Therefore, we ran an ANCOVA with self-target and partner target as factors, and participant forward hand position at perturbation onset time as a covariate. In Experiment 1, we found no main effect of participant forward hand position on involuntary visuomotor feedback responses (F[1,47] = 1.466, p = 0.228). Further, when including the covariate, we still found a significant interaction between self-target and partner target on involuntary visuomotor feedback responses (F[1,47] = 43.2, p < 0.001).”

      “In Experiment 2, we found a significant main effect of participant forward hand position on involuntary visuomotor feedback responses (F[1,47] = 6.73, p = 0.010). We still found a significant interaction between self-target and partner target (F[1,47] = 9.78, p = 0.002). Since we found a main effect of participant forward hand position, we calculated the adjusted means of the involuntary visuomotor feedback responses. We then performed follow-up mean comparisons on the adjusted means of the involuntary visuomotor feedback responses (using emmeans in R). We found the same significant trends as the unadjusted means in the main manuscript. Specifically, we found involuntary visuomotor feedback responses to be: significantly greater in the partner-relevant/self-irrelevant condition compared to the partner-irrelevant/self-irrelevant condition (p = 0.003), significantly greater in the partner-relevant/self-irrelevant condition compared to the partner-irrelevant/self-relevant condition (p < 0.001), significantly greater in the partner-relevant/self-relevant condition compared to the partner-relevant/self-irrelevant condition (p < 0.001), and not different between the partner-irrelevant/self-relevant and partner-relevant/self-relevant conditions (p = 0.824).”

      We have also included in the discussion how time-to-target and participant forward hand position are important control variables to consider, and their potential relationship to our findings.

      “Finally, we also considered whether time to target [1,2] (Supplementary D), participant forward hand position (Supplementary E), or learning [4] (Supplementary G-H) influenced feedback responses, but found that none impacted the observed differences between experimental conditions nor changed our interpretation. Our hypothesis that the sensorimotor system uses a representation of a partner and considers the partner’s costs to modify involuntary visuomotor feedback responses parsimoniously accounts for the differences observed between all conditions.”

      (2) Synchrony

      In our task, participants’ movements were not self-initiated. We had them begin the movement as soon as they heard an audible tone so that they would begin their movements at as similar a time as possible. We have analyzed the movement onset synchrony between participants within a pair, shown in Supplementary F.

      Supplementary: “We calculated movement onset times at the time that the participants left the start target [8]. We then took the absolute value of the difference between the participants within a pair as a measure of movement onset synchrony. For Experiment 1, an ANOVA with movement onset synchrony as the dependent variable showed no main effect of self-target (F[1,47] = 1.38, p = 0.252), no main effect of partner target (F[1,47] = 0.057, p = 0.813), and no interaction (F[1,47] = 0.45, p = 0.508). For Experiment 2, an ANOVA with movement onset synchrony as the dependent variable showed no main effect of self-target (F[1,47] = 0.07, p = 0.788), no main effect of partner target (F[1,47] = 2.75, p = 0.111), and no interaction (F[1,47] = 2.31, p = 0.142).”
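      The synchrony measure described above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: the function name and the example onset times are hypothetical, and onset is assumed to be the time (in ms after the tone) at which each participant left the start target.

```python
from statistics import mean


def onset_synchrony_ms(onsets_a, onsets_b):
    """Per-trial absolute difference in movement onset between two partners.

    onsets_a, onsets_b: onset times in ms for participant A and B,
    one entry per trial, in the same trial order (assumed layout).
    """
    if len(onsets_a) != len(onsets_b):
        raise ValueError("both partners need one onset time per trial")
    return [abs(a - b) for a, b in zip(onsets_a, onsets_b)]


# Hypothetical onset times for three trials of one pair.
pair_sync = onset_synchrony_ms([210, 195, 230], [200, 205, 230])
print(pair_sync)        # [10, 10, 0]
print(mean(pair_sync))  # pair-level average entered into the ANOVA
```

      Smaller values mean the two partners started moving closer together in time; averaging within a pair and condition would give one value per cell for the ANOVA.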

      Further, we have modified our methods to emphasize that participants within a pair generally began their movement at the same time.

      “Instead of self-initiating their movements, we specifically had participants move at the sound of a tone so that the movement onset between participants in a pair was as synchronous as possible (see Supplementary F for movement onset synchrony analysis).”

      Reviewer #1 (Recommendations for the authors):

      (1) Lines 291-292: One study extensively examined cursor and target jump visuomotor onset times and found no difference (Franklin et al., 2016; J Neuroscience), which strongly argues against this interpretation.

      We thank the reviewer for pointing out this work. We have modified the following lines:

      “However, other work by Franklin and colleagues (2016) found no difference in visuomotor feedback response latencies between cursor and target jumps [6].”

      (2) Line 411: What were the instructions regarding partner performance in terms of the reward? Did you explain that individual performance alone will determine the reward?

      As addressed above, we have made the following changes to emphasize the instructions given to participants.

      “In other words, we ensured participants had a clear understanding that their performance in the task was only based on stabilizing the center cursor in their own self-target within the time constraint. Therefore, the instructions and timing constraints did not enforce participants to work together.”

      (3) Line 506: Ten probe trials in each direction is very low. Can this still be in the transition state of the feedback response, rather than at steady state? There are many studies done looking at the learning of visuomotor responses in which changes are still occurring after several hundred trials (e.g., Franklin et al., 2017 J Neurophysiol; Franklin et al., 2008; J Neuroscience). In this experiment, each block only lasts 151 trials total if my calculations are correct. How certain are you that the results are at a steady state and not continuously changing? Perhaps with further experimental experience, the feedback responses would approach the predictions of a different model.

      The reviewer raises an important point. We had run these analyses prior to submitting the manuscript and did not see any evidence of learning. However, we believe this information is important to include since both we and the reviewer asked the same question. Specifically, we have analyzed the visuomotor feedback responses over the trials (Supplementary G), which shows little to no learning over time. Additionally, we also found no difference in the visuomotor feedback response trends between the first and second half of trials in each condition (Supplementary H). Therefore, it appears that the sensorimotor system was at steady state behaviour very quickly, and we do not believe that the feedback responses would approach the predictions of a different model if participants performed more trials. We have added the following:

      Supplementary: “Given there were 151 trials and 10 left/right probe trials for each experimental condition, it is possible that completing more trials may have led to different involuntary visuomotor feedback responses. Therefore, we analysed the involuntary visuomotor feedback responses over the course of each experimental condition. Visually, involuntary visuomotor feedback responses in neither Experiment 1 (Fig. S8) nor Experiment 2 (Fig. S9) show any consistent learning (see Fig. S10 for statistical analysis). Therefore, it appears participants rapidly formed a partner model based on knowledge of their movement goal to modify their involuntary visuomotor feedback responses.”

      Supplementary: “Supplementary Fig. S10 shows the involuntary visuomotor feedback responses in the first half (A,C) and second half (B,D) for each experimental condition. In Experiment 1, we observed the same statistical results in the first half and second half of trials as the analysis of all trials. That is, we observed a significant interaction between self-target and partner target in the first half (F[1,47] = 37.09, p < 0.001) and second half (F[1,47] = 48.68, p < 0.001) of trials. Follow-up mean comparisons showed the same significant trends as our analysis of all trials in the main manuscript (see Fig. S10A-B).”

      Supplementary: “In Experiment 2, we observed the same statistical results in the first half and second half of trials as the analysis of all trials. That is, we observed a significant interaction between self-target and partner target in the first half (F[1,47] = 9.42, p = 0.004) and second half (F[1,47] = 17.40, p < 0.001) of trials. Follow-up mean comparisons showed the same significant trends as our analysis of all trials in the main manuscript (Fig. S10C-D).”

      Supplementary: “Showing the same involuntary visuomotor feedback response trends across the experimental conditions for the first half, second half, and all trials suggests that the sensorimotor system quickly formed a model of a partner and considered their costs to modify rapid motor responses.”

      We have also added to the discussion:

      “Finally, we also considered whether time to target [1,2] (Supplementary D), participant forward hand position (Supplementary E), or learning [4] (Supplementary G) influenced feedback responses, but found that none impacted the observed differences between experimental conditions nor changed our interpretation.”

      (4) The authors should also discuss some of the prior work which is very relevant to the tasks studied: (Knill, Bondada & Chhabra, 2011, J Neuroscience). There may also be other papers that use this task for visuomotor feedback responses and therefore, should be included.

      We have included the Knill 2011 paper and also Cross 2019 in our discussion:

      “This modification of feedback responses based on a relevant/irrelevant task goal has also been shown in response to visual perturbations [7,8].”

      (5) Lines 301-303: The terms ’relevant’ and ’irrelevant’ here describe different concepts than the ones used in this study. I suggest making a distinction to avoid confusion for the reader.

      We thank the reviewer for pointing out that this is confusing. We’ve made the following changes to improve the clarity:

      “Further, Franklin and colleagues (2008) designed a visual perturbation to be relevant or irrelevant when reaching to the same target, showing greater involuntary visuomotor feedback responses to a relevant visual perturbation compared to an irrelevant visual perturbation [9].”

      (6) Line 459: The reaching movement was quite slow (25cm in about 1.2 seconds). Is this needed to ensure that both participants can complete the movements, given potentially very different start times? Please comment as this is different than many previous studies.

      Participants needed to stabilize the cursor for 500ms in their target within a time constraint of 1400 - 1600 ms. Therefore, they had to reach the target between 900 - 1100 ms (before stabilizing). Additionally, participants did not perform self-initiated movements, but were required to begin their movement as soon as they heard an audible tone. Given that reaction times are ~200ms, participants had ~700 - 900 ms to reach the target, which aligns with previous research (Franklin et al. (2008), Franklin et al. (2012), Nashed et al. (2012)). We have clarified the time constraints of the task in our Methods:

      “They therefore had 700 - 900 ms to first reach the target, since humans generally have response times ~200 ms, and they needed to stabilize within the target for 500 ms (i.e., 1400 - 200 - 500 = 700 ms and 1600 - 200 - 500 = 900 ms). Movement times of 700 - 900 ms are thus consistent with previous human reaching studies [4,9,10].”
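      The timing budget quoted above is simple subtraction, sketched here for readers who want to check it. The function and parameter names are hypothetical; the numbers (1400 - 1600 ms trial window, ~200 ms response time, 500 ms hold) come from the text.

```python
def reach_window_ms(trial_min_ms, trial_max_ms, reaction_ms, hold_ms):
    """Return (earliest, latest) time in ms left for the reach itself.

    The reach must fit into the trial window once the response time
    before movement onset and the final hold inside the target are
    subtracted from both ends of the window.
    """
    earliest = trial_min_ms - reaction_ms - hold_ms
    latest = trial_max_ms - reaction_ms - hold_ms
    return earliest, latest


window = reach_window_ms(1400, 1600, reaction_ms=200, hold_ms=500)
print(window)  # (700, 900)
```

      A different assumed response time shifts both bounds by the same amount, so the 200 ms figure matters when comparing against movement times in other studies.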

      (7) Reference [25] is incomplete

      Thank you for catching this.

      And thank you for the thoughtful and clear review. We feel it has greatly improved the quality and clarity of our manuscript!

      Reviewer #2 (Public review):

      Summary

      Sullivan and colleagues studied the fast, involuntary, sensorimotor feedback control in interpersonal coordination. Using a cleverly designed joint-reaching experiment that separately manipulated the accuracy demands for a pair of participants, they demonstrated that the rapid visuomotor feedback response of a human participant to a sudden visual perturbation is modulated by his/her partner’s control policy and cost. The behavioral results are well-matched with the predictions of the optimal feedback control framework implemented with the dynamic game theory model. Overall, the study provides an important and novel set of results on the fast, involuntary feedback response in human motor control, in the context of interpersonal coordination.

      We thank the reviewer for the kind words!

      Review:

      Sullivan and colleagues investigated whether fast, involuntary sensorimotor feedback control is modulated by the partner’s state (e.g., cost and control policy) during interpersonal coordination. They asked a pair of participants to make a reaching movement to control a cursor and hit a target, where the cursor’s position was a combination of each participant’s hand position. To examine fast visuomotor feedback response, the authors applied a sudden shift in either the cursor (experiment 1) or the target (experiment 2) position in the middle of movement. To test the involvement of partner’s information in the feedback response, they independently manipulated the accuracy demand for each participant by varying the lateral length of the target (i.e., a wider/narrower target has a lower/higher demand for correction when movement is perturbed). Because participants could also see their partner’s target, they could theoretically take this information (e.g., whether their partner would correct, whether their correction would help their partner, etc.) into account when responding to the sudden visual shift. Computationally, the task structure can be handled using dynamic game theory, and the partner’s feedback control policy and cost function are integrated into the optimal feedback control framework. As predicted by the model, the authors demonstrated that the rapid visuomotor feedback response to a sudden visual perturbation is modulated by the partner’s control policy and cost. When their partner’s target was narrow, they made rapid feedback corrections even when their own target was wide (no need for correction), suggesting integration of their partner’s cost function. Similarly, they made corrections to a lesser degree when both targets were narrower than when the partner’s target was wider, suggesting that the feedback correction takes the partner’s correction (i.e., feedback control policy) into account.

      The strength of the current paper lies in the combination of clever behavioral experiments that independently manipulate each participant’s accuracy demand and a sophisticated computational approach that integrates optimal feedback control and dynamic game theory. Both the experimental design and data analysis are sound. While the main claim is well-supported by the results, the only current weakness is the lack of discussion of limitations and an alternative explanation. Adding these points will further strengthen the paper.

      Reviewer #2 (Recommendations for the authors):

      (1) While the current version is already well-written, it would be helpful for readers to further discuss the relationship between the current study and some potentially relevant studies, such as Braun et al. (2009), Ganesh et al. (2014), and Takagi et al. (2017) (2019).

      Thank you for pointing out these papers that we missed, which we now cite appropriately in light of our own work. In particular, we have added the following to our discussion, including Braun et al. (2009) and Takagi et al. (2017) (2019). However, Beckers et al. (2020) showed conflicting results from Ganesh et al. (2014), and since these works are about learning, we feel it is outside the scope of our work.

      “Further, others have shown that the sensorimotor system modifies movement selection according to game-theoretic predictions, [11] and that the sensorimotor system modifies movements using an estimate of the joint goal during human-human interactions [12,13].”

      (2) For an alternative interpretation of the results, one could consider, for instance, that the target’s visual appearance could have served as a contextual cue for learning different movement gains in the lateral direction (e.g., whether the partner corrects the shift might be approximated as a gain change). Although less likely, this alternative account could be tested by simulation and would strengthen the argument.

      This is a thoughtful comment, also brought up by Reviewer 1. Here we provide our previous response that addresses this concern. It is possible that the change in the visuomotor feedback responses could arise simply from a scaling factor. This hypothesis could explain the difference between two conditions, but would fail to explain differences between two other conditions. Specifically, this hypothesis could explain a decrease in involuntary visuomotor feedback responses between partner-irrelevant/self-relevant and partner-relevant/self-relevant. Critically, this hypothesis could not explain the difference between partner-irrelevant/self-irrelevant and partner-relevant/self-irrelevant. That is, there is no reason to scale a response to correct for a partner’s relevant target when your own target is irrelevant. However, our finding that there is a greater involuntary visuomotor feedback response in partner-relevant/self-irrelevant compared to partner-irrelevant/self-irrelevant is predicted by the notion that humans form a representation of others and consider their movement costs.

      We have added a paragraph in the discussion to justify our hypothesis over the scaling factor hypothesis.

      “Our hypothesis that the sensorimotor system uses a representation of a partner and considers the partner’s costs to modify involuntary visuomotor feedback responses can parsimoniously explain all of our experimental findings. There are a few alternative hypotheses that could explain a subset of results. One alternative hypothesis is that participants simply learned the hand to center cursor mapping in each experimental condition. That is, instead of using a model of their partner, participants simply adapted to the dynamics of the center cursor. However, this hypothesis would not predict an increased involuntary visuomotor feedback response in the partner-relevant/self-irrelevant condition compared to the partner-irrelevant/self-irrelevant condition. If participants did not form a model of their partner nor consider their partner’s costs, then they would not display an increased feedback response when they had an irrelevant target and their partner’s target was relevant. An increased feedback response to help a partner achieve their goal is captured by our hypothesis that the sensorimotor system uses a representation of a partner and considers the partner’s costs to modify involuntary visuomotor feedback responses.”

      (3) Another (maybe unlikely) alternative interpretation is that the targets’ visual appearances might have been confusing. One might find that the closed square is common to both targets for the “Partner Relevant Self Irrelevant” and the “Partner Relevant Self Relevant”, and that this might have elicited the response to perturbation in “Partner Relevant Self Irrelevant”. Related to this point, it would be informative to describe how the “cooperative” fast feedback response developed over the course of the experiment, for instance, by comparing behaviors across experimental blocks.

      We have partitioned this question into two responses, relating to visual appearance of the targets and the development (i.e., learning) of visuomotor feedback responses over the course of the experiments.

      (1) Participants confused by visual appearance of the targets.

      We were also concerned that participants might be confused by the targets, and therefore confirmed with participants after the experiment that they correctly understood that the light grey filled rectangle was their own target and the dark grey hollow rectangle was their partner’s. Furthermore, in the partner-relevant/self-irrelevant, partner-irrelevant/self-relevant, and partner-relevant/self-relevant conditions, there is a small square target in each of the conditions. However, we found that the partner-irrelevant/self-relevant and partner-relevant/self-relevant conditions both elicited significantly greater involuntary visuomotor feedback responses than the partner-relevant/self-irrelevant condition. Thus, participants’ involuntary visuomotor feedback responses suggest that they correctly formed different representations based on an accurate understanding of the self vs partner target. The other reviewer had related comments about the visual stimuli, which we also address within the discussion.

      “Another alternative hypothesis would be that the sensorimotor system was responding only to the relevant target displayed on the screen. Again, this hypothesis would only explain a subset of our results. In particular, this relevant target hypothesis cannot explain the observed differences between the partner-relevant/self-irrelevant and partner-irrelevant/self-relevant conditions in both Experiments 1 and 2.”

      (2) Comparing feedback responses over time

      We have included the visuomotor feedback responses over each experimental condition in Supplementary G. Notably, we did not find any learning effect, suggesting that the sensorimotor system quickly developed a model of a partner’s behaviour and used that model to modify feedback responses. We have also added a paragraph on learning to our discussion.

      We’ve addressed how learning did not play a role in this study:

      “Finally, we also considered whether time to target [1,2] (Supplementary D), participant forward hand position (Supplementary E), or learning [4] (Supplementary G-H) influenced feedback responses, but found that none impacted the observed differences between experimental conditions nor changed our interpretation.”

      Supplementary: “Given there were 151 trials and 10 left/right probe trials for each experimental condition, it is possible that completing more trials may have led to different involuntary visuomotor feedback responses. Therefore, we analysed the involuntary visuomotor feedback responses over the course of each experimental condition. Visually, involuntary visuomotor feedback responses in neither Experiment 1 (Fig. S8) nor Experiment 2 (Fig. S9) show any consistent learning (see Fig. S10 for statistical analysis). Therefore, it appears participants rapidly formed a partner model based on knowledge of their movement goal to modify their involuntary visuomotor feedback responses.”

      Supplementary: “Supplementary Fig. S10 shows the involuntary visuomotor feedback responses in the first half (A,C) and second half (B,D) for each experimental condition. In Experiment 1, we observed the same statistical results in the first half and second half of trials as the analysis of all trials. That is, we observed a significant interaction between self-target and partner target in the first half (F[1,47] = 37.09, p < 0.001) and second half (F[1,47] = 48.68, p < 0.001) of trials. Follow-up mean comparisons showed the same significant trends as our analysis of all trials in the main manuscript (see Fig. S10A-B).”

      Supplementary: “Showing the same involuntary visuomotor feedback response trends across the experimental conditions for the first half, second half, and all trials suggests that the sensorimotor system used a model of a partner based on their goals and considered their costs to modify rapid motor responses.”

      (4) It looks slightly counterintuitive (and therefore interesting) that the participant shows some amount of fast feedback responses in the “Partner Relevant Self Irrelevant” condition, since they were instructed to only consider the self-target. Based on the results, the authors suggest an altruistic feature of the motor system (lines 333-340). It would be helpful to clarify the basis for this interpretation, whether it is formally derived from the game-theoretic framework or represents a more conceptual interpretation. Providing additional explanation that translates the game-theoretic reasoning into more accessible, intuitive terms would help readers better understand and evaluate this claim.

      We are glad the reviewer also finds this result interesting. The reviewer raises an important point that there needs to be a more clear explanation for why we believe this result was found. We have made the following changes to the discussion:

      “Furthermore, this result is predicted by our dynamic game theory models that include the partner’s costs in the self cost function. In other words, a dynamic game theory model that selects feedback gains to minimize both the self and partner cost reflects an altruistic control policy.”
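      One way to read the statement above is as a weighted sum of costs. The notation below is assumed purely for illustration and is not taken from the manuscript: $J_{\text{self}}$ and $J_{\text{partner}}$ stand for each participant's own cost, $K_1$ and $K_2$ for the two participants' feedback gains, and $\alpha \ge 0$ is a hypothetical weight on the partner's cost.

```latex
% Illustrative sketch only; symbols are assumptions, not the authors' notation.
\[
  J_1(K_1, K_2) \;=\; J_{\text{self}}(K_1, K_2) \;+\; \alpha \, J_{\text{partner}}(K_1, K_2),
  \qquad \alpha \ge 0 .
\]
```

      Under this reading, $\alpha = 0$ describes a controller that ignores the partner, whereas $\alpha > 0$ makes a correction worthwhile when the partner's target is narrow even though the participant's own target is wide, which is the pattern described for the partner-relevant/self-irrelevant condition.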

      (5) Please check whether all references are displayed correctly. Some of them (e.g., 25, 65) seemed not correctly shown in the References section.

      We have fixed the citation.

      We thank the reviewer for providing a clear and insightful review. Their comments have significantly improved the manuscript.

      References

      (1) Česonis, J., & Franklin, D. W. (2020). Time-to-Target Simplifies Optimal Control of Visuomotor Feedback Responses. eNeuro, 7 (2), ENEURO.0514–19.2020.

      (2) Česonis, J., & Franklin, D. W. (2022). Contextual Cues Are Not Unique for Motor Learning: Task-dependant Switching of Feedback Controllers. PLOS Computational Biology, 18 (6), ed. by Haith, A. M.: e1010192.

      (3) Crevecoeur, F., Kurtzer, I., Bourke, T., & Scott, S. H. (2013). Feedback Responses Rapidly Scale with the Urgency to Correct for External Perturbations. Journal of Neurophysiology, 110 (6), 1323–1332.

      (4) Franklin, S., Wolpert, D. M., & Franklin, D. W. (2012). Visuomotor Feedback Gains Upregulate during the Learning of Novel Dynamics. Journal of Neurophysiology, 108 (2), 467–478.

      (5) Liu, Y., Leib, R., Dudley, W., Shafti, A., Faisal, A. A., & Franklin, D. W. (2025). Partner-Sourced Haptic Feedback Rather than Environmental Inputs Drives Coordination Improvement in Human Dyadic Collaboration. Scientific Reports, 15 (1), 40347.

      (6) Franklin, D. W., Reichenbach, A., Franklin, S., & Diedrichsen, J. (2016). Temporal Evolution of Spatial Computations for Visuomotor Control. The Journal of Neuroscience, 36 (8), 2329–2341.

      (7) Knill, D. C., Bondada, A., & Chhabra, M. (2011). Flexible, Task-Dependent Use of Sensory Feedback to Control Hand Movements. The Journal of Neuroscience, 31 (4), 1219–1237.

      (8) Cross, K. P., Cluff, T., Takei, T., & Scott, S. H. (2019). Visual Feedback Processing of the Limb Involves Two Distinct Phases. The Journal of Neuroscience, 39 (34), 6751–6765.

      (9) Franklin, D. W., & Wolpert, D. M. (2008). Specificity of Reflex Adaptation for Task-Relevant Variability. The Journal of Neuroscience, 28 (52), 14165–14175.

      (10) Nashed, J. Y., Crevecoeur, F., & Scott, S. H. (2012). Influence of the Behavioral Goal and Environmental Obstacles on Rapid Feedback Responses. Journal of Neurophysiology, 108 (4), 999–1009.

      (11) Braun, D. A., Ortega, P. A., & Wolpert, D. M. (2009). Nash Equilibria in Multi-Agent Motor Interactions. PLoS Computational Biology, 5 (8), ed. by Friston, K. J.: e1000468.

      (12) Takagi, A., Ganesh, G., Yoshioka, T., Kawato, M., & Burdet, E. (2017). Physically Interacting Individuals Estimate the Partner’s Goal to Enhance Their Movements. Nature Human Behaviour, 1 (3), 0054.

      (13) Takagi, A., Hirashima, M., Nozaki, D., & Burdet, E. (2019). Individuals Physically Interacting in a Group Rapidly Coordinate Their Movement by Estimating the Collective Goal. eLife, 8, e41328.

    1. eLife Assessment

      This study addresses an important question and shows how social navigation in homing pigeons can be explained by simple averaging, without requiring any complex cognitive abilities. The evidence, based on a rigorous and systematic comparison of seven models and data on how social routes can be generated from solitary routes, is compelling. The authors should be commended for their willingness to critically re-examine established interpretations.

    2. Reviewer #1 (Public review):

      Summary:

      This study investigates how collective navigation improvements arise in homing pigeons. Building on the Sasaki & Biro (2017) experiment on homing pigeons, the authors use simulations to test seven candidate social learning strategies of varying cognitive complexity, ranging from simple route averaging to potentially cognitively demanding selective propagation of superior routes. They show that only the simplest strategy, equal route averaging, quantitatively matches the experimental data in both route efficiency and social weighting. More complex strategies, while potentially more effective, fail to align with the observed data. The authors also introduce the concept of "effective group size," showing that the chaining design leads to a strong dilution of earlier individuals' contributions. Overall, they conclude that cognitive simplicity rather than cumulative cultural evolution explains collective route improvements in pigeons.

      Strengths:

      The manuscript provides a compelling argument that a simpler hypothesis is necessary and sufficient to explain the findings of a recent study on improvements to pigeon routes, through a rigorous, systematic comparison of seven alternative hypotheses. The authors should be commended for their willingness to critically re-examine established interpretations. The introduction and discussion are broad and link pigeon navigation to general debates on social learning, wisdom of crowds, and CCE.

      Weaknesses:

      The authors' method focuses on trajectory-level average behaviour rather than the fine-scale decision-making processes of organisms. This is acknowledged in the manuscript by the authors.

      Comments on revision:

      The authors have addressed most of the comments by me as well as the other reviewer.

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript investigates which social navigation mechanisms, with different cognitive demands, can explain experimental data collected from homing pigeons. Interestingly, the results indicate that the simplest strategy - route averaging - aligns best with the experimental data, while the most demanding strategy - selectively propagating the best route - offers no advantage. Further, the results suggest that a mixed strategy of weighted averaging may provide significant improvements.

      The manuscript addresses the important problem of identifying possible mechanisms that could explain observed animal behavior by systematically comparing different candidate models. A core aspect of the study is the calculation of collective routes from individual bird routes using different models that were hypothesized to be employed by the animals but which differ in their cognitive demands.

      The manuscript is well written, with high-quality figures supporting both the description of the approach taken and the presentation of results. The results should be of interest to a broad community of researchers investigating (collective) animal behavior, ranging from experiment to theory. The general approach and mathematical methods appear reasonable and show no obvious flaws. The statistical methods also appear sound.

      Strengths:

      The main strength of the manuscript is the systematic comparison of different meta-mechanisms for social navigation by modeling social trajectories from solitary trajectories and directly comparing them with experimental results on social navigation. The results show that the experimentally observed behavior could, in principle, arise from simple route averaging without the need to identify "knowledgeable" individuals. Another strength of the work is the establishment of a connection between social navigation behavior and the broader literature on the wisdom of crowds through the concept of effective group size.

      Comments on revision:

      The authors made substantial revisions to the manuscript, addressing my comments. While I do think that regarding my second comment on CCE the authors could be a bit more bold, I am overall satisfied with the revisions made.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This study investigates how collective navigation improvements arise in homing pigeons. Building on the Sasaki & Biro (2017) experiment on homing pigeons, the authors use simulations to test seven candidate social learning strategies of varying cognitive complexity, ranging from simple route averaging to potentially cognitively demanding selective propagation of superior routes. They show that only the simplest strategy, equal route averaging, quantitatively matches the experimental data in both route efficiency and social weighting. More complex strategies, while potentially more effective, fail to align with the observed data. The authors also introduce the concept of "effective group size," showing that the chaining design leads to a strong dilution of earlier individuals' contributions. Overall, they conclude that cognitive simplicity rather than cumulative cultural evolution explains collective route improvements in pigeons.

      Strengths:

      The manuscript addresses an important question and provides a compelling argument that a simpler hypothesis is necessary and sufficient to explain findings of a recent influential study on pigeon route improvements, via a rigorous systematic comparison of seven alternative hypotheses. The authors should be commended for their willingness to critically re-examine established interpretations. The introduction and discussion are broad and link pigeon navigation to general debates on social learning, wisdom of crowds, and CCE.

      We thank the reviewer for their positive comments.

      Weaknesses:

      Code and data are not available for this manuscript, which is a particular concern given that it critically examines and proposes alternative hypotheses for an important published work.

      We thank the reviewer for their comment. The code and data for our manuscript are an important aspect of the study, and we had intended to make them publicly available upon publication. Our code and data are available on Figshare (https://doi.org/10.6084/m9.figshare.28950032.v1). We have now revised the manuscript to include a link to our dataset.

      Reviewer #2 (Public review):

      Summary:

      The manuscript investigates which social navigation mechanisms, with different cognitive demands, can explain experimental data collected from homing pigeons. Interestingly, the results indicate that the simplest strategy - route averaging - aligns best with the experimental data, while the most demanding strategy - selectively propagating the best route - offers no advantage. Further, the results suggest that a mixed strategy of weighted averaging may provide significant improvements.

      The manuscript addresses the important problem of identifying possible mechanisms that could explain observed animal behavior by systematically comparing different candidate models. A core aspect of the study is the calculation of collective routes from individual bird routes using different models that were hypothesized to be employed by the animals, but which differ in their cognitive demands.

      The manuscript is well-written, with high-quality figures supporting both the description of the approach taken and the presentation of results. The results should be of interest to a broad community of researchers investigating (collective) animal behavior, ranging from experiment to theory. The general approach and mathematical methods appear reasonable and show no obvious flaws. The statistical methods also appear sound.

      Strengths:

      The main strength of the manuscript is the systematic comparison of different meta-mechanisms for social navigation by modeling social trajectories from solitary trajectories and directly comparing them with experimental results on social navigation. The results show that the experimentally observed behavior could, in principle, arise from simple route averaging without the need to identify "knowledgeable" individuals. Another strength of the work is the establishment of a connection between social navigation behavior and the broader literature on the wisdom of crowds through the concept of effective group size.

      We thank the reviewer for their positive comments.

      Weaknesses:

      However, there are two main weaknesses that should be addressed:

      (1) The first concerns the definition of "mechanism" as used by the authors, for example, when writing "navigation mechanism." Intuitively, one might assume that what is meant is a behavioral mechanism in the sense of how behavior is generated as a dynamic process. However, here it is used at a more abstract (meta) level, referring to high-level categories such as "averaging" versus "leader-follower" dynamics. It is not used in the sense of how an individual makes decisions while moving, where the actual route followed in a social context emerges from individuals navigating while simultaneously interacting with conspecifics in space and time. In the presented work, the approach is to directly combine (global) route data of solitary birds according to the considered "meta-mechanisms" to generate social trajectories. Of course, this is not how pigeon social navigation actually works: they do not sit together before the flight and say, "This is my route, this is your route, let's combine them in this way." A mechanistic modeling approach would instead be some form of agent-based model that describes how agents move and interact in space and time. Such a "bottom-up" approach, however, has its drawbacks, including many unknown parameters and often strongly simplifying (implicit) assumptions. I do not expect the authors to conduct agent-based modeling, but at the very least, they should clearly discuss what they mean by "mechanism" and clarify that while their approach has advantages, such as naturally accounting for the statistical features of solitary routes and allowing a direct comparison of different meta-mechanisms, it is also limited, as it does not address how behavior is actually generated. For example, the approach lacks any explicit modeling of errors, uncertainty, or stochasticity more broadly (e.g., due to environmental influences). Thus, while the presented study yields some interesting results, it can only be considered an intermediate step toward understanding actual behavioral mechanisms.

      We thank the reviewer for their comment and thoughtful suggestions. We agree that the inherent behavioral mechanisms and the biological basis of these mechanisms cannot be determined from the navigational data alone. For instance, it remains unexplored whether pigeons adapt their behavior based only on social cues from their partners or also use other navigational features such as landmarks or roads, the location of the sun, geomagnetic cues, or previously learnt routes. However, we do agree (as also pointed out by the reviewer) that these behavioral rules generate an emergent ‘meta-mechanism’ in which the bird pairs behave as if their preferred routes are averaged during a flight. It will be important in future work to explore the biological basis of these mechanisms, but our current approach allows us to describe the mechanisms with any confidence only in a meta sense. Considering this, we believe that our analysis is a top-down approach to describing the outcomes of these underlying mechanisms in an abstract sense. We would also like to point the reviewer to Dalmaijer, 2024 [1], who took a bottom-up approach with naive agents and showed that cumulative route improvements emerged in the absence of any sophisticated communication in the same dataset, in agreement with our approach. We have now added a paragraph: “It is also important to clarify that we use the terms…… that lead to these meta-mechanisms arising remain an open question.” found in lines 120-129 in our Introduction to make this clarification.
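
      The idea of combining solitary routes directly, as discussed in this exchange, can be illustrated with a minimal sketch. It assumes that routes are polylines sampled at the same number of points, that a pair route is a pointwise weighted average, and that efficiency is the straight-line distance divided by the distance flown. The function names and the coordinates are hypothetical and are not taken from the authors' code.

```python
import math

def path_length(route):
    """Total length of a polyline given as a list of (x, y) points."""
    return sum(math.dist(a, b) for a, b in zip(route, route[1:]))

def efficiency(route):
    """Straight-line start-to-end distance divided by the distance flown."""
    return math.dist(route[0], route[-1]) / path_length(route)

def combine(route_a, route_b, weight_a=0.5):
    """Pointwise weighted average of two routes with equal numbers of points."""
    if len(route_a) != len(route_b):
        raise ValueError("routes must have the same number of points")
    weight_b = 1.0 - weight_a
    return [
        (weight_a * xa + weight_b * xb, weight_a * ya + weight_b * yb)
        for (xa, ya), (xb, yb) in zip(route_a, route_b)
    ]

# Two made-up solo routes from (0, 0) to (10, 0) that detour on opposite sides.
solo_a = [(0, 0), (5, 4), (10, 0)]
solo_b = [(0, 0), (5, -2), (10, 0)]

# Equal averaging (weight 0.5 each): the opposite detours partly cancel.
pair = combine(solo_a, solo_b)

for name, route in [("solo A", solo_a), ("solo B", solo_b), ("pair", pair)]:
    print(name, round(efficiency(route), 3))
```

      In this toy case the averaged route is straighter than either solo route although neither bird evaluates the other's route, which is the intuition behind treating averaging as a cognitively simple meta-mechanism. Setting `weight_a` away from 0.5 gives a weighted-averaging variant.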

      (2) While the presented study raises important questions about the applicability and viability of cumulative cultural evolution (CCE) in explaining certain animal behaviors such as social navigation, I find that it falls short in discussing them. What are the implications regarding the applicability of CCE to animal data and to previously claimed experimental evidence for CCE? Should these experiments be re-analyzed or critically reassessed? If not, why? What are good examples from animal behavior where CCE should not be doubted? Furthermore, what about the cited definitions and criteria of CCE? Are they potentially too restrictive? Should they be revised, and if so, how? Conversely, if the definitions become too general, is CCE still a useful concept for studying certain classes of animal behavior? I think these are some of the very important questions that could be addressed or at least raised in the discussion to initiate a broader debate within the community.

      We thank the reviewer for their comments and interesting questions regarding our study. We agree with the reviewer that our study opens up new avenues for critically analysing the criteria previous studies have used to provide evidence of CCE in non-human animals. In our literature review, we found that the field has usually thought about CCE in a ‘process’-focused manner (Reindl et al. [2]), with regard to individuals being able to compare strategies and select the ones that result in higher individual fitness. This preferential selection of strategies, termed innovations, allows for the stereotypical ratcheting effect seen in CCE. In our study, we propose that in the case of homing pigeons, the ratcheting effect is a statistical outcome rather than the result of deliberate individual judgement. We believe that this strategy is also amenable to certain task types (which in our study was homing route choice) and may change for others (for example, solving a puzzle box), and that the task also needs to be sufficiently complex for animals to benefit from the use of social information (Caldwell & Millen 2008 [3]). Thus, we recommend that future work address which classes of problems fit well within the definition of “emergent” CCE and which ones don’t. Keeping this framework in mind, studies should clearly state which definition of CCE they are using, and should be critically evaluated for their underlying task type and cognitive mechanisms before being deemed CCE. Considering these points, we have now expanded our Discussion to include a paragraph: “Our results highlight the need for more…..range of task types and cognitive abilities.” found in lines 420-433 to highlight these key questions.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      I do not have any major objections, but I am clarifying my points as major or minor depending on the effort required to address (mostly via rewriting and clarifications).

      Major comments:

      (1) A schematic summary of the original study: Since the current manuscript builds directly on Sasaki & Biro (2017), it would greatly help readers if you included a concise schematic figure summarizing the original experiment. For instance, a simple panel could depict the chain design (experienced + naïve replacements), the control treatments, and the key empirical findings (improvements in route efficiency across generations, and route similarity within vs. between chains). Presenting this visually would save readers the effort of reconstructing the design and main results from text alone, especially for those unfamiliar with the original paper. It would also clarify exactly what empirical patterns your simulations are intended to reproduce.

      We thank the reviewer for this comment. We have now revised the manuscript with a schematic illustration adapted from the original study by Sasaki and Biro (2017). We hope this clarifies the experimental design and results we aimed to highlight in our work.

      (2) Reproducibility: Code and data are only "available on request." I believe eLife has strong policies on open science; a lack of immediate open access to analysis would be a barrier. I find it jarring that a paper intending to reproduce and improve upon a previously published paper does not make the code and data available for peer review or to readers without an explicit request.

      We have taken the feedback into consideration and updated the Data Availability section with a link to our Figshare dataset.

      (3) One huge drawback of the current format of the manuscript, where Methods come after Results, is that one has to really struggle to understand and appreciate Figures 2 and 3. I would strongly urge authors to have a shorter methods section embedded either as a subsection before the Results, or within the results section, as described in each figure. Perhaps a lot of my confusion also comes from not having known the previous paper, but it may be true for other readers, too. More specifically, for Figure 3, how is social weight for the experiments inferred? Figure 3 caption talks of mean difference, but one has to check the manuscript at multiple places throughout to really understand what this difference is (the definition) and how it is computed.

      While we acknowledge that the Methods section comes at the end of our manuscript, we tried to structure our text to tell a story (as stated in our manuscript title). To this end, we organized the text into short titled subsections that briefly convey the relevant background, identify the knowledge gap and outline our approach. We chose this structure to reserve the in-depth details about model implementation and statistical analysis for the Methods.

      Additionally, we made sure to include references to methodological details in relevant segments of the Introduction and Results sections, so as not to bog the reader down in model complexities and to keep a coherent narrative that delivers the message of our study. To further address the background of our work, we have now added a schematic of the original study in response to a previous comment by the reviewer, which we hope helps the reader better understand our work. We hope this explanation clarifies the intention behind our writing choice and decision to retain the current structure.

      (4) The introduction of the 'effective group size' concept is a potentially valuable and intuitive way to interpret chain dynamics, but the explanation is somewhat buried in the Results/Methods; I suggest highlighting it more prominently (e.g., in the Discussion or with a schematic in the Results) so readers can readily grasp this useful idea.

      We are glad that the reviewer found our concept of ‘effective group size’ useful. We do believe, however, that we introduced the idea and the rationale for using this method in the Results: “We asked to what extent……to an equivalent group size” found in lines 305-314. We reserved the detailed description of the method for the Methods section. To further emphasize the importance of the concept, we have now added the following text to the Discussion: “This is further supported….. slightly better than two individuals.” found in lines 389-394.
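
      The dilution of earlier birds' contributions under a chaining design can be made concrete with a rough sketch. It assumes that generation 1 is a solo bird and that each later pair route is an equal average of the experienced bird's current route and the naive bird's solo route. The inverse sum of squared weights is used as a stand-in measure of how many birds effectively contribute. This is an illustrative assumption, so the number it produces should not be read as the manuscript's result; the manuscript's own definition of effective group size is given in its Methods and may differ.

```python
def chain_weights(generations, w_experienced=0.5):
    """Weight of each bird's original solo route in the final pair route.

    Assumes generation 1 is a solo bird and every later pair route is a
    weighted average of the experienced bird's current route and the naive
    bird's solo route (an illustrative assumption).
    """
    weights = [1.0]
    for _ in range(generations - 1):
        # Everything inherited through the experienced bird is scaled down,
        # and the newly added naive bird takes the remaining weight.
        weights = [w * w_experienced for w in weights] + [1.0 - w_experienced]
    return weights

def effective_size(weights):
    """Inverse sum of squared weights; equals n when n birds contribute equally."""
    return 1.0 / sum(w * w for w in weights)

weights = chain_weights(5)
print(weights)                  # contributions of birds 1..5 to the final route
print(effective_size(weights))  # stays below 3 under these assumptions
```

      With equal averaging the most recent bird always holds half of the total weight, so adding generations barely increases the number of birds that effectively contribute.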

      Minor comments:

      (1) Line 12: "what is the navigation mechanism(s)" - the (s) is a bit awkward. Either remove (s) or ask what the mechanisms are.

      We have fixed the typo to clarify the statement.

      (2) Line 78: "Such 'ratchet'-like improvements is referred to..." → "are referred to."

      We have fixed the typo to clarify the statement.

      (3) Figure 3 caption: "color scheme in the plots are same" → should be "is the same."

      We have fixed the typo to clarify the statement.

      (4) Clarification on reporting confidence intervals: The manuscript reports confidence intervals (CIs) for the model-based comparisons (e.g., Figures 2-3). This might seem unnecessary for simulation studies, since running more iterations can arbitrarily shrink uncertainty. However, in your case, the CIs are justified because the simulations are anchored to a finite empirical dataset (only 9 solo trajectories), sampled with replacement, and analyzed with mixed-effects models that incorporate bird identity as a random effect. Thus, the intervals reflect biological sample variability rather than simulation noise. This must be clarified.

      We have added a clarifying statement: “...and reflect the biological uncertainty in the empirical dataset, not simulation noise” found in lines 241 and 293 in the captions of Figures 2 and 3 in accordance with the reviewer’s comment. 

      (5) One part of the issue is that details of methods come much later in the manuscript, perhaps following journal style. Therefore, I recommend explicitly highlighting this rationale in the Results, so readers do not misinterpret the CIs as simply reflecting simulation error.

      We believe that the clarifying statements we have now added in the captions of Figures 2 and 3 should convey this interpretation of CIs and further changes in the Results may not be required.
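
      The point about confidence intervals raised in comments (4) and (5) can be illustrated with a minimal percentile-bootstrap sketch. It does not reproduce the mixed-effects models mentioned by the reviewer, and the nine efficiency values are hypothetical placeholders rather than the study's data.

```python
import random
import statistics

def bootstrap_ci(values, n_resamples=5000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    # Resample the birds with replacement and record the mean each time.
    means = sorted(
        statistics.fmean(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    )
    lo_idx = int((1 - level) / 2 * n_resamples)
    hi_idx = int((1 + level) / 2 * n_resamples) - 1
    return means[lo_idx], means[hi_idx]

# Hypothetical efficiencies for nine solo birds (not the study's data).
solo_eff = [0.71, 0.78, 0.80, 0.83, 0.85, 0.88, 0.90, 0.92, 0.95]

for n in (1000, 20000):
    low, high = bootstrap_ci(solo_eff, n_resamples=n)
    print(n, round(low, 3), round(high, 3), "width", round(high - low, 3))
```

      Increasing `n_resamples` makes the interval endpoints more stable but does not shrink the interval, because its width is set by the variability among the nine birds rather than by the number of simulation runs.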

      With these changes, we hope that we have improved the clarity of our manuscript.

      References:

      (1) Dalmaijer ES (2024) Cumulative route improvements spontaneously emerge in artificial navigators even in the absence of sophisticated communication or thought. PLoS Biol. 22:e3002644.

      (2) Reindl, E., Gwilliams, A.L., Dean, L.G. et al. (2020) Skills and motivations underlying children’s cumulative cultural learning: case not closed. Palgrave Commun 6, 106.

      (3) Caldwell CA, Millen AE (2008) Studying cumulative cultural evolution in the laboratory. Phil. Trans. R. Soc. B 363:3529-3539.